###############################################################################
# (c) Copyright 2023 CERN for the benefit of the LHCb Collaboration           #
#                                                                             #
# This software is distributed under the terms of the GNU General Public      #
# Licence version 3 (GPL Version 3), copied verbatim in the file "COPYING".   #
#                                                                             #
# In applying this licence, CERN does not waive the privileges and immunities #
# granted to it by virtue of its status as an Intergovernmental Organization  #
# or submit itself to any jurisdiction.                                       #
###############################################################################
import argparse
import jinja2
import matplotlib.pyplot as plt
import pandas as pd

plt.ioff()

WWW_BASE_URL = "https://cern.ch/lhcbpr-hlt/UpgradeRateTest"

REPORT_TEMPLATE = jinja2.Template("""
<html>
<head></head>
<body>
<p>
    slot.build_id: $$version$$<br>
    platform: $$platform$$<br>
    hostname: $$hostname$$<br>
    cpu_info: $$cpu_info$$
</p>
<ul>
    <li><a href="{{WWW_BASE_URL}}/$$dirname$$/run.log">Logs</a></li>
</ul>
<p>
    Results per working group and stream:
    <ul>
    <li>Inclusive retention and rate</li>
    <li>(Jaccard) similarity matrix</li>
    <li>Average DstData size and bandwidth</li>
    <li>Average event size and bandwidth</li>
    </ul>
</p>
<p>
    Results per line: all of the above, plus
    <ul>
    <li>Exclusive retention and rate</li>
    <li>Descriptives (whether persistreco and/or extra outputs are enabled)</li>
    </ul>
</p>
<p> See: <a href="https://lbfence.cern.ch/alcm/public/figure/details/32">RTA Workflow</a> for reference figures regarding bandwidth.</p>
{{HLT2_OR_SPRUCE_TEMPLATE}}
<p>
    Other results are shown by plots or tables (in the links) below. <br>
</p>
<object type="image/png" data="lines_per_wg.png"></object>
<p>
    The number of selection lines per working group. <br>
    The "Other" category contains lines whose parsed name does not belong to any known WG. <br>
    For lines to be categorized properly, their names should follow the naming convention
    and start with `Hlt2[WG]_` or `Spruce[WG]_`.
</p>
<object type="image/png" data="hist_rate.png"></object>
<p>
    Distribution of rate of selection lines. <br>
    The total distribution is shown as a stacked histogram, split by WG. <br>
    The distributions per WG are shown in the HTML page linked below. <br>
    A line is considered "problematic", and requires attention, if its rate is 0 Hz
    or larger than 1 kHz. <br>
    The rates of all lines are listed in the HTML page linked below. <br>
</p>
<object type="image/png" data="hist_dst_size.png"></object>
<p>
    Distribution of DstData RawBank size of selection lines. <br>
    The total distribution is shown as a stacked histogram, split by WG. <br>
    The distributions per WG are shown in the HTML page linked below.
</p>
<object type="image/png" data="hist_tot_size.png"></object>
<p>
    Distribution of total event size of selection lines. <br>
    The total distribution is shown as a stacked histogram, split by WG. <br>
    The distributions per WG are shown in the HTML page linked below. <br>
    A line is considered "problematic", and requires attention, if its DstData size or total event size
    is larger than 1 MB. <br>
    The event sizes of all lines are listed in the HTML page linked below. <br>
</p>
<object type="image/png" data="hist_dst_bandwidth.png"></object>
<p>
    Distribution of bandwidth computed from DstData RawBank size. <br>
    The total distribution is shown as a stacked histogram, split by WG. <br>
    The distributions per WG are shown in the HTML page linked below.
</p>
<object type="image/png" data="hist_tot_bandwidth.png"></object>
<p>
    Distribution of bandwidth computed from total event size. <br>
    The total distribution is shown as a stacked histogram, split by WG. <br>
    The distributions per WG are shown in the HTML page linked below. <br>
    Currently, a line is considered "problematic", and requires attention, if its bandwidth from DstData size
    is larger than 200 MB/s. This is a temporary limit. <br>
    The bandwidths of all lines are listed in the HTML page linked below. <br>
</p>
<object type="image/png" data="memory_consumption.png"></object>
<p>
    Memory consumption as a function of wall time. <br>
    The virtual memory size is the total amount of memory the process may hypothetically access. <br>
    The resident set size (RSS) is the portion of memory occupied by the run that is held in main memory (RAM). <br>
    The proportional set size (PSS) is the private memory occupied by the run itself plus its proportional share of memory shared with one or more other processes. <br>
    As only one test is launched at a time, PSS should be close to RSS here, and PSS gives the memory actually used by this test. <br>
    Swap memory is used when RAM is full. <br>
    The maximum resident set size usage is $$max_rss$$ GB. <br>
    The maximum proportional set size usage is $$max_pss$$ GB. <br>
</p>
<ul>
    <li><a href="{{WWW_BASE_URL}}/$$dirname$$/other_lines.html">Show list of lines in "Other" category</a></li>
    <li><a href="{{WWW_BASE_URL}}/$$dirname$$/plots_per_wg.html">Show plots split by WGs</a></li>
    <li><a href="{{WWW_BASE_URL}}/$$dirname$$/all_rates.html">Show rates, event sizes and bandwidths of all lines</a></li>
    <li><a href="{{WWW_BASE_URL}}/$$dirname$$/similarities_jaccards.html"> Show Jaccard similarities of different stream configurations</a></li>
    <li><a href="{{WWW_BASE_URL}}/$$dirname$$/rates_streaming.html"> Show rates of streams under different configurations</a></li>
    <li><a href="{{WWW_BASE_URL}}/$$dirname$$/line-descriptives.html"> PersistReco and ExtraOutput for selection lines</a></li>
    $$comparison$$
    </b></b>
</ul>
<p> Additional results for HLT2 Bandwidth test (not available for Sprucing test) </p>
<ul>
    <li><a href="{{WWW_BASE_URL}}/$$dirname$$/line-rates-split-production.html"> Split by production stream: rates, event sizes and bandwidths of all lines</a></li>
    <li><a href="{{WWW_BASE_URL}}/$$dirname$$/line-rates-split-wg.html"> Split by working group: rates, event sizes and bandwidths of all lines</a></li>
</ul>
</body>
</html>""")

HLT2_REPORT_TEMPLATE = jinja2.Template("""<p>
    The bandwidth test was run under 3 streaming configurations: streamless (all lines written to the same output file), production-stream and wg-stream. <br>
    The definitions of the production streaming and working-group streaming can be found below.
</p>
<ul>
    <li><a href="{{WWW_BASE_URL}}/$$dirname$$/hlt2-production-stream-config.json">Production-stream configuration</a></li>
    <li><a href="{{WWW_BASE_URL}}/$$dirname$$/hlt2-wg-stream-config.json">WG-stream configuration</a></li>
</ul>
<p>
    The production-stream configuration reflects the streaming we will have for data taking. <br>
    The rates, event sizes and bandwidths from the production-stream configuration are: <br>
</p>
{{table_5stream_rates}}""")

SPRUCE_REPORT_TEMPLATE = jinja2.Template("""<p>
    The bandwidth test was run under 2 streaming configurations: streamless and one stream per WG. <br>
    The definition of the per-WG-stream configuration can be found below.
</p>
<ul>
    <li><a href="{{WWW_BASE_URL}}/$$dirname$$/wg-stream-config.json">wg-stream configuration</a></li>
</ul>
<p>
    The wg-stream configuration is close to what we will have for data taking. <br>
    The rates, event sizes and bandwidths from the wg-stream configuration are: <br>
</p>
{{table_wgstream_rates}}""")

TABLE_OTHER_LINE_TEMPLATE = jinja2.Template("""
<p>
    List of line names that are categorized as "Other".
</p>
{{table_other_lines}}
""")

PLOTS_PER_WG_TEMPLATE = jinja2.Template("""
<p>
    Plots of rates, event sizes and bandwidths for lines, split into different WGs.
</p>
{{plots_per_wg}}
""")

ALL_RATE_TEMPLATE = jinja2.Template("""
<p>
    Rates, event sizes and bandwidths of all lines, listed in descending order of retention rate. <br>
    The results are obtained from a per-event analysis under the 5-stream configuration. <br>
    These numbers are also saved in a csv file: <a href="{{WWW_BASE_URL}}/$$dirname$$/rates-for-all-lines.csv">rates-for-all-lines.csv</a>
</p>
""")

known_working_groups = [
    "B2CC",
    "B2OC",
    "BandQ",
    "BnoC",
    "Calib",
    "Calo",
    "Charm",
    "DPA",
    "HLT",
    "IFT",
    "Luminosity",
    "PID",
    "QCD",
    "QEE",
    "RD",
    "RTA",
    "Simulation",
    "SL",
    "Tagging",
    "Tracking",
]
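

# Illustrative sketch only, not used by the rest of this script: how a line
# name is assigned to a working group, mirroring the parsing in `make_plots`.
# The function name and the example line names in the checks are hypothetical.
# Unlike `make_plots`, which counts a line once for every WG name it starts
# with, this sketch returns the first match only.
def classify_line_sketch(line_name, process, wgs):
    """Return the first WG in `wgs` that the parsed line name starts with, else "Other"."""
    # Skip the process prefix: len("Hlt2") == 4, len("Spruce") == 6.
    prefix_length = 4 if process == "Hlt2" else 6
    # Keep only the part before the first underscore, e.g. "Hlt2Charm".
    parsed_name = line_name.split("_")[0][prefix_length:]
    for wg in wgs:
        if parsed_name.startswith(wg):
            return wg
    return "Other"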


def make_plots_per_wg(wg_name,
                      rate_list,
                      dst_size_list,
                      tot_size_list,
                      dst_bandwidth_list,
                      tot_bandwidth_list,
                      process="Hlt2"):
    '''
    Make plots of rates and event sizes for each WG.

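    Arguments:
        wg_name: name of the working group
        rate_list: list containing rates [Hz] of all lines from the WG
        dst_size_list: list containing DstData RawBank sizes [kB] of all lines from the WG
        tot_size_list: list containing total event sizes [kB] of all lines from the WG
        dst_bandwidth_list: list containing bandwidths [MB/s] computed from the DstData RawBank size of all lines from the WG
        tot_bandwidth_list: list containing bandwidths [MB/s] computed from the total event size of all lines from the WG
        process: either `Hlt2` or `Sprucing`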
    '''

    # Make a plot of line rates
    fig = plt.figure()
    plt.hist(rate_list, 100, range=(0, 1000))
    plt.xlabel("Rate [Hz]")
    plt.ylabel("Number of lines")
    plt.title(f"Rate of {process} lines from {wg_name} group")
    plt.savefig(f"tmp/Output/hist_rate_{wg_name}.png", format="png")
    plt.close(fig)

    # Make two histograms of event sizes
    fig = plt.figure()
    plt.hist(dst_size_list, 100)
    plt.xlabel("Event size [kB]")
    plt.ylabel("Number of lines")
    plt.title(f"DstData RawBank size of {process} lines from {wg_name} group")
    plt.savefig(f"tmp/Output/hist_dst_size_{wg_name}.png", format="png")
    plt.close(fig)

    fig = plt.figure()
    plt.hist(tot_size_list, 100)
    plt.xlabel("Event size [kB]")
    plt.ylabel("Number of lines")
    plt.title(f"Total event size of {process} lines from {wg_name} group")
    plt.savefig(f"tmp/Output/hist_tot_size_{wg_name}.png", format="png")
    plt.close(fig)

    # Make two histograms of bandwidth
    fig = plt.figure()
    plt.hist(dst_bandwidth_list, 100)
    plt.xlabel("Bandwidth from DstData size [MB/s]")
    plt.ylabel("Number of lines")
    plt.title(
        f"Bandwidth from DstData RawBank of {process} lines from {wg_name} group"
    )
    plt.savefig(f"tmp/Output/hist_dst_bandwidth_{wg_name}.png", format="png")
    plt.close(fig)

    fig = plt.figure()
    plt.hist(tot_bandwidth_list, 100)
    plt.xlabel("Bandwidth from total event size [MB/s]")
    plt.ylabel("Number of lines")
    plt.title(
        f"Bandwidth from total event size of {process} lines from {wg_name} group"
    )
    plt.savefig(f"tmp/Output/hist_tot_bandwidth_{wg_name}.png", format="png")
    plt.close(fig)


def make_plots(rate_dict,
               evt_size_dict,
               bandwidth_dict,
               tot_rate,
               tot_bandwidth,
               process="Hlt2",
               wgs=known_working_groups):
    '''
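    Make plots of rates, event sizes and bandwidths of all lines.
    It will create five stacked histograms containing distributions of all lines
    (rate, DstData and total event size, DstData and total bandwidth),
    a pie chart showing the number of lines per WG, and per-WG histograms.

    Arguments:
        rate_dict: dictionary of line names and their rates
        evt_size_dict: dictionary of line names and their (DstData size, total event size)
        bandwidth_dict: dictionary of line names and their (DstData bandwidth, total bandwidth)
        tot_rate: total rate of all lines
        tot_bandwidth: total bandwidth of all lines
        process: either `Hlt2` or `Sprucing`
        wgs: list of working groups to categorize

    Returns:
        the list of WGs that have at least one line, and the list of line names in the "Other" category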
    '''

    # Count number of lines and rates/evt sizes per WG
    if process == "Hlt2":
        name_index = 4  # len("Hlt2")
    elif process == "Sprucing":
        name_index = 6  # len("Spruce")
    else:
        raise ValueError(f"Unknown process: {process}")
    num_lines_per_wg = {wg: 0 for wg in wgs}
    rates_per_wg = {wg: [] for wg in wgs}
    num_lines_per_wg["Other"] = 0
    rates_per_wg["Other"] = []
    list_other_lines = []
    for k, v in rate_dict.items():
        found_wg = False
        name = k.split("_")[0][name_index:]
        for wg in num_lines_per_wg.keys():
            if name.startswith(wg):
                found_wg = True
                num_lines_per_wg[wg] += 1
                rates_per_wg[wg].append(v)
        if not found_wg:
            num_lines_per_wg["Other"] += 1
            rates_per_wg["Other"].append(v)
            list_other_lines.append(k)
    keys_to_remove = [
        wg for wg in num_lines_per_wg.keys() if num_lines_per_wg[wg] == 0
    ]
    for key in keys_to_remove:
        num_lines_per_wg.pop(key)

    dst_size_per_wg = {wg: [] for wg in wgs}
    tot_size_per_wg = {wg: [] for wg in wgs}
    dst_size_per_wg["Other"] = []
    tot_size_per_wg["Other"] = []
    for k, v in evt_size_dict.items():
        found_wg = False
        name = k.split("_")[0][name_index:]
        for wg in num_lines_per_wg.keys():
            if name.startswith(wg):
                found_wg = True
                dst_size_per_wg[wg].append(v[0])
                tot_size_per_wg[wg].append(v[1])
        if not found_wg:
            dst_size_per_wg["Other"].append(v[0])
            tot_size_per_wg["Other"].append(v[1])
    dst_bandwidth_per_wg = {wg: [] for wg in wgs}
    tot_bandwidth_per_wg = {wg: [] for wg in wgs}
    dst_bandwidth_per_wg["Other"] = []
    tot_bandwidth_per_wg["Other"] = []
    for k, v in bandwidth_dict.items():
        found_wg = False
        name = k.split("_")[0][name_index:]
        for wg in num_lines_per_wg.keys():
            if name.startswith(wg):
                found_wg = True
                dst_bandwidth_per_wg[wg].append(v[0])
                tot_bandwidth_per_wg[wg].append(v[1])
        if not found_wg:
            dst_bandwidth_per_wg["Other"].append(v[0])
            tot_bandwidth_per_wg["Other"].append(v[1])

    # Sort the wg in number of lines
    num_lines_per_wg = {
        k: v
        for k, v in sorted(num_lines_per_wg.items(), key=lambda x: x[1])
    }
    rates_per_wg = {k: rates_per_wg[k] for k in num_lines_per_wg.keys()}
    dst_size_per_wg = {k: dst_size_per_wg[k] for k in num_lines_per_wg.keys()}
    tot_size_per_wg = {k: tot_size_per_wg[k] for k in num_lines_per_wg.keys()}
    dst_bandwidth_per_wg = {
        k: dst_bandwidth_per_wg[k]
        for k in num_lines_per_wg.keys()
    }
    tot_bandwidth_per_wg = {
        k: tot_bandwidth_per_wg[k]
        for k in num_lines_per_wg.keys()
    }

    # Make a pie plot of lines per WG
    labels = ["%s (%d)" % (k, v) for k, v in num_lines_per_wg.items()]
    fig = plt.figure()
    plt.pie(
        num_lines_per_wg.values(),
        radius=1,
        labels=labels,
        wedgeprops=dict(width=0.4, edgecolor="w"))
    plt.title(f"Number of {process} lines per WG")
    plt.savefig("tmp/Output/lines_per_wg.png", format="png")
    plt.close(fig)

    labels = ["%s" % k for k in num_lines_per_wg.keys()]
    label_ind = {labels[i]: i for i in range(len(labels))}
    label_ind = {
        k: v
        for k, v in sorted(label_ind.items(), key=lambda x: x[0])
    }
    if "Other" in label_ind.keys():
        other_ind = label_ind["Other"]
        label_ind.pop("Other")
        label_ind["Other"] = other_ind
    new_order_for_label = list(label_ind.values())
    # Make a stacked histogram of line rates
    fig = plt.figure()
    plt.hist(
        list(rates_per_wg.values()),
        100,
        range=(0, 1000),
        label=labels,
        stacked=True)
    plt.xlabel("Rate [Hz]")
    plt.ylabel("Number of lines")
    plt.title("Rate of %s lines, total rate: %.2f kHz" % (process, tot_rate))
    handles, _ = plt.gca().get_legend_handles_labels()
    plt.legend([handles[i] for i in new_order_for_label],
               [labels[i] for i in new_order_for_label],
               loc="upper right")
    plt.savefig("tmp/Output/hist_rate.png", format="png")
    plt.close(fig)

    # Make two stacked histograms of event sizes
    fig = plt.figure()
    plt.hist(
        list(dst_size_per_wg.values()),
        100,
        range=(0, 500 if process == 'Hlt2' else 1000),
        label=labels,
        stacked=True)
    plt.xlabel("Event size [kB]")
    plt.ylabel("Number of lines")
    plt.title(f"DstData RawBank size of {process} lines")
    handles, _ = plt.gca().get_legend_handles_labels()
    plt.legend([handles[i] for i in new_order_for_label],
               [labels[i] for i in new_order_for_label],
               loc="upper right")
    plt.savefig("tmp/Output/hist_dst_size.png", format="png")
    plt.close(fig)

    fig = plt.figure()
    plt.hist(
        list(tot_size_per_wg.values()),
        100,
        range=(0, 500 if process == 'Hlt2' else 1000),
        label=labels,
        stacked=True)
    plt.xlabel("Event size [kB]")
    plt.ylabel("Number of lines")
    plt.title(f"Total event size of {process} lines")
    handles, _ = plt.gca().get_legend_handles_labels()
    plt.legend([handles[i] for i in new_order_for_label],
               [labels[i] for i in new_order_for_label],
               loc="upper right")
    plt.savefig("tmp/Output/hist_tot_size.png", format="png")
    plt.close(fig)

    # Make two stacked histograms of bandwidth
    fig = plt.figure()
    plt.hist(
        list(dst_bandwidth_per_wg.values()),
        100,
        range=(0, 200 if process == 'Hlt2' else 2000),
        label=labels,
        stacked=True)
    plt.xlabel("Bandwidth from DstData size [MB/s]")
    plt.ylabel("Number of lines")
    plt.title(f"Bandwidth from DstData RawBank of {process} lines")
    handles, _ = plt.gca().get_legend_handles_labels()
    plt.legend([handles[i] for i in new_order_for_label],
               [labels[i] for i in new_order_for_label],
               loc="upper right")
    plt.savefig("tmp/Output/hist_dst_bandwidth.png", format="png")
    plt.close(fig)

    fig = plt.figure()
    plt.hist(
        list(tot_bandwidth_per_wg.values()),
        100,
        range=(0, 200 if process == 'Hlt2' else 2000),
        label=labels,
        stacked=True)
    plt.xlabel("Bandwidth from total event size [MB/s]")
    plt.ylabel("Number of lines")
    plt.title(
        f"Bandwidth from total event size of {process} lines. Total bandwidth: {tot_bandwidth:.2f} GB/s"
    )
    handles, _ = plt.gca().get_legend_handles_labels()
    plt.legend([handles[i] for i in new_order_for_label],
               [labels[i] for i in new_order_for_label],
               loc="upper right")
    plt.savefig("tmp/Output/hist_tot_bandwidth.png", format="png")
    plt.close(fig)

    wg_list = list(label_ind.keys())
    for wg_name in wg_list:
        make_plots_per_wg(
            wg_name,
            rates_per_wg[wg_name],
            dst_size_per_wg[wg_name],
            tot_size_per_wg[wg_name],
            dst_bandwidth_per_wg[wg_name],
            tot_bandwidth_per_wg[wg_name],
            process=process)

    return wg_list, list_other_lines


def make_other_line_table(name_list):
    table_html_str = r'''<table border = "1">
    <tr>
        <th> Name </th>
    </tr>'''
    for name in name_list:
        table_html_str += '''
    <tr>
        <td> %s </td>
    </tr>''' % name
    table_html_str += '\n</table>'
    return table_html_str


def make_plots_per_wg_list(wg_list):
    list_html_str = ''
    for wg_name in wg_list:
        list_html_str += f'''
        <p>
            Plots of {wg_name} group:
        </p>
        <object type="image/png" data="hist_rate_{wg_name}.png"></object>
        <object type="image/png" data="hist_dst_size_{wg_name}.png"></object>
        <object type="image/png" data="hist_tot_size_{wg_name}.png"></object>
        <object type="image/png" data="hist_dst_bandwidth_{wg_name}.png"></object>
        <object type="image/png" data="hist_tot_bandwidth_{wg_name}.png"></object>
        '''
    return list_html_str
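

# Illustrative sketch only, not used by the rest of this script: counting the
# "problematic" lines described in the report text, i.e. lines with a rate of
# exactly 0 Hz or above a limit. The function name is hypothetical; the limit
# is passed in because it depends on the process.
def count_problematic_rates_sketch(rates_hz, high_rate_limit_hz):
    """Return (n_zero_rate, n_high_rate) for a dict of line name -> rate in Hz."""
    n_zero_rate = sum(1 for rate in rates_hz.values() if rate == 0)
    n_high_rate = sum(
        1 for rate in rates_hz.values() if rate > high_rate_limit_hz)
    return n_zero_rate, n_high_rate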


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='make_bandwidth_test_page')
    parser.add_argument(
        '--input',
        type=str,
        default=None,
        help='File path to read and save output files')
    parser.add_argument(
        '--process',
        type=str,
        choices=['Hlt2', 'Sprucing'],
        required=True,
        help='Which stage was the test run on')
    parser.add_argument(
        '-r',
        '--rate',
        default=500,  # kHz
        type=float,
        help='Input rate corresponding to the input file in kHz')
    args = parser.parse_args()

    # Read info of all lines
    df = pd.read_csv(f'{args.input}/rates-for-all-lines.csv', sep=',')
    number_of_lines = len(df)
    rate_dict = {
        df['Line'][i]: df['Rate (kHz)'][i] * 1000
        for i in range(number_of_lines)
    }
    evt_size_dict = {
        df['Line'][i]: (df['Avg DstData Size (kB)'][i],
                        df['Avg Total Event Size (kB)'][i])
        for i in range(number_of_lines)
        if df['Avg Total Event Size (kB)'][i] != 0
    }
    bandwidth_dict = {
        df['Line'][i]: (
            df['DstData Bandwidth (GB/s)'][i] * 1000.,  # in MB/s
            df['Total Bandwidth (GB/s)'][i] * 1000.)  # in MB/s
        for i in range(number_of_lines)
        if df['Avg Total Event Size (kB)'][i] != 0
    }

    # Prepare messages to GitLab
    # limits on rate: 1 kHz for Hlt2 rate and 0.5% for Sprucing retention
    tol = 1000 if args.process == 'Hlt2' else 500
    n_low_rate = len({k: v for k, v in rate_dict.items() if v == 0})
    n_high_rate = len({k: v for k, v in rate_dict.items() if v > tol})
    # Read info from default config:
    # Hlt2: 5-stream-config
    # Spruce: wg-stream-config
    if args.process == 'Hlt2':
        df = pd.read_csv(
            f'{args.input}/rates-production-stream-configuration.csv')
    elif args.process == 'Sprucing':
        df = pd.read_csv(f'{args.input}/rates-wg-stream-configuration.csv')
    tot_rate = sum(df['Rate (kHz)'])
    tot_bandwidth = sum(df['Total Bandwidth (GB/s)'])

    # Make plots & tables
    wg_list, other_line_list = make_plots(
        rate_dict,
        evt_size_dict,
        bandwidth_dict,
        tot_rate=tot_rate,
        tot_bandwidth=tot_bandwidth,
        process=args.process)

    other_line_table = make_other_line_table(other_line_list)
    plots_per_wg = make_plots_per_wg_list(wg_list)

    if args.process == 'Hlt2':
        with open(f"{args.input}/rates-production-stream-configuration.html",
                  "r") as rate_html:
            table_5stream_rates = rate_html.read()
        hlt2_or_spruce_template = HLT2_REPORT_TEMPLATE.render(
            WWW_BASE_URL=WWW_BASE_URL, table_5stream_rates=table_5stream_rates)
        input_rate_sentence = f"The input rate to this job was {args.rate} kHz (output rate of Hlt1)."
    elif args.process == 'Sprucing':
        with open(f"{args.input}/rates-wg-stream-configuration.html",
                  "r") as rate_html:
            table_wgstream_rates = rate_html.read()
        hlt2_or_spruce_template = SPRUCE_REPORT_TEMPLATE.render(
            WWW_BASE_URL=WWW_BASE_URL,
            table_wgstream_rates=table_wgstream_rates)
        input_rate_sentence = f"The input rate to this job was {args.rate} kHz (output rate of Hlt2)."

    with open(f"{args.input}/index.html", "w") as html_file:
        html = REPORT_TEMPLATE.render(
            WWW_BASE_URL=WWW_BASE_URL,
            HLT2_OR_SPRUCE_TEMPLATE=hlt2_or_spruce_template,
            INPUT_RATE_SENTENCE=input_rate_sentence)
        html_file.write(html)
    with open(f"{args.input}/other_lines.html", "w") as html_file:
        html = TABLE_OTHER_LINE_TEMPLATE.render(
            table_other_lines=other_line_table)
        html_file.write(html)
    with open(f"{args.input}/plots_per_wg.html", "w") as html_file:
        html = PLOTS_PER_WG_TEMPLATE.render(plots_per_wg=plots_per_wg)
        html_file.write(html)

    with open(f"{args.input}/all_rates.html", "w") as html_file:
        html = ALL_RATE_TEMPLATE.render(WWW_BASE_URL=WWW_BASE_URL)
        html_file.write(html)
        with open(f"{args.input}/rates-for-all-lines.html", "r") as rate_table:
            html_file.write(rate_table.read())

    with open(f"{args.input}/similarities_jaccards.html", "w") as html_file:
        html = f"""
            <p>
               The Jaccard similarity of all lines (under the streamless configuration) is saved to a csv file: <a href="{WWW_BASE_URL}/$$dirname$$/similarity-jaccard-all.csv">similarity-jaccard-all.csv</a>
            </p>"""
        html_file.write(html)
        if args.process == 'Hlt2':
            html_file.write("""
                <p>
                    The Jaccard similarity of the production-stream configuration is:
                </p>
                """)
            with open(f"{args.input}/production-similarities-jaccard.html",
                      "r") as jaccard:
                html_file.write(jaccard.read())
            html_file.write("""
                <p>
                   The Jaccard similarity of the working-group-stream configuration is:
                </p>
                """)
            with open(f"{args.input}/wg-similarities-jaccard.html",
                      "r") as jaccard:
                html_file.write(jaccard.read())
        elif args.process == 'Sprucing':
            html_file.write("""
                <p>
                    The Jaccard similarity of the wg-stream configuration is:
                </p>
                """)
            with open(f"{args.input}/wg-similarities-jaccard.html",
                      "r") as jaccard:
                html_file.write(jaccard.read())
    with open(f"{args.input}/rates_streaming.html", "w") as html_file:
        if args.process == 'Hlt2':
            html_file.write("""
                <p>
                   The rates, event sizes and bandwidths of production-stream configuration are:
                </p>
                """)
            with open(
                    f"{args.input}/rates-production-stream-configuration.html",
                    "r") as rate_html:
                html_file.write(rate_html.read())
            html_file.write("""
                <p>
                   The rates, event sizes and bandwidths of working-group-stream configuration are:
                </p>
                """)
            with open(f"{args.input}/rates-wg-stream-configuration.html",
                      "r") as rate_html:
                html_file.write(rate_html.read())
        elif args.process == 'Sprucing':
            html_file.write("""
                <p>
                   The rates, event sizes and bandwidths of wg-stream configuration are:
                </p>
                """)
            with open(f"{args.input}/rates-wg-stream-configuration.html",
                      "r") as rate_html:
                html_file.write(rate_html.read())
    with open(f"{args.input}/message.txt", "w") as message:
        message.write(f'total_rate = {tot_rate:.2f} kHz\n')
        message.write(f'total_bandwidth = {tot_bandwidth:.2f} GB/s\n')
        message.write(f'n_low_rate = {n_low_rate:d}\n')
        message.write(f'n_high_rate = {n_high_rate:d}\n')