DNS HTTPS Measurement

[Artifact] Server-side DNS HTTPS dataset

We provide DNS HTTPS record datasets collected through daily scans of the Tranco top 1 million domains. Specifically, we offer the following resources:

  • Dataset: Our parsed dataset includes HTTPS, A, AAAA, NS, and SOA records, plus the RRSIG of each HTTPS record (where available), for the Tranco top 1 million domains. Records are collected daily from our measurement server, and new data is published monthly. Details can be found in the Dataset section below.

  • Code: We provide the code used to generate the graphs in our paper, which can serve as a starting point for creating your own graphs from our datasets. We also provide the script used to collect DNS record data for Tranco domains.

Dataset

We provide parsed data that has been pre-processed to include only the information needed for analysis. Each day, four CSV files are generated: two for apex domains and two for www subdomains.

  • DNS records data (apex_https.csv and www_https.csv): This data includes the following DNS record types for each of the Tranco top 1 million domains (if we are unable to retrieve responses for certain DNS records, those entries are left empty). For domains with a CNAME, we follow the CNAME and resolve DNS records for the target.

    • DNS record types
      • HTTPS (and the corresponding RRSIG, if available)
      • NS
      • A
      • AAAA
      • SOA
  • DNS flags data (apex_flags.csv and www_flags.csv): This data includes the flags returned in the response to the HTTPS record request. The following flags are provided as boolean values:

    • Flags: AD, QR, RD, RA, CD, AA, TC
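
For readers less familiar with these flags, the sketch below shows how the seven bits sit in the 16-bit flags word of a DNS message header (bit positions per RFC 1035 and RFC 4035). It is illustrative only: the helper name `decode_flags` is hypothetical and is not taken from our scripts, and the flags CSV files already provide these values as booleans, so no decoding is needed to use the dataset.

```python
# Bit masks for the flags word of a DNS message header (RFC 1035, RFC 4035).
FLAG_MASKS = {
    "QR": 0x8000,  # message is a response
    "AA": 0x0400,  # authoritative answer
    "TC": 0x0200,  # message was truncated
    "RD": 0x0100,  # recursion desired
    "RA": 0x0080,  # recursion available
    "AD": 0x0020,  # authenticated data (DNSSEC)
    "CD": 0x0010,  # checking disabled (DNSSEC)
}


def decode_flags(flags_word: int) -> dict:
    """Hypothetical helper: map each flag name to True/False for a header flags word."""
    return {name: bool(flags_word & mask) for name, mask in FLAG_MASKS.items()}
```

For example, a flags word of 0x81A0 decodes to QR, RD, RA, and AD set, with AA, TC, and CD clear.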

Data is released as one archive per month, and each monthly archive is further divided into daily data. We have been scanning HTTPS records daily since May 2023.

Download links

*File format: tar.gz

Date (YYYY-MM) Download Misc.
2024-10 link
2024-09 link
2024-08 link
2024-07 link
2024-06 link
2024-05 link
2024-04 link
2024-03 link
2024-02 link
2024-01 link
2023-12 link
2023-11 link
2023-10 link
2023-09 link
2023-08 link
2023-07 link
2023-06 link
2023-05 link
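
As a starting point for inspecting a downloaded archive, the following sketch lists the CSV files in a tar.gz file and groups them by the directory they sit in. It assumes that daily data is stored in separate directories inside the archive; that layout and the function name `group_csvs_by_directory` are assumptions made for illustration, so check the actual archive contents before relying on it.

```python
import tarfile
from collections import defaultdict
from pathlib import PurePosixPath


def group_csvs_by_directory(archive_path: str) -> dict:
    """Map each directory inside a .tar.gz archive to the sorted CSV file names it holds."""
    groups = defaultdict(list)
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive.getmembers():
            # Skip directories and anything that is not a CSV file.
            if member.isfile() and member.name.endswith(".csv"):
                path = PurePosixPath(member.name)
                groups[str(path.parent)].append(path.name)
    return {directory: sorted(names) for directory, names in sorted(groups.items())}
```

Calling the function with the path of a downloaded monthly archive returns a dictionary whose keys are the directories found in it, which makes it easy to confirm that each day contains the four expected CSV files.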

Code

We provide code that can be used to generate the graphs in our paper, using the parsed data above as input. These scripts can serve as starting points for your own analysis. For those who only wish to reproduce the graphs from the paper, we also provide processed data that can be directly used by plotting scripts. Additionally, we provide code for querying DNS records for Tranco domains.

- Generating graphs in the paper

Working with the parsed dataset

After downloading the code (e.g., cloning the GitHub repo) and dataset, place the dataset in the data/parsed/ directory. Please refer to the instructions in the README.md.
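
Before running the notebooks, a quick sanity check on a parsed file can be useful. The sketch below counts how many rows have a non-empty value in a given column, relying on the fact that entries are left empty when no response was retrieved. It assumes the CSV files have a header row; the column name passed in is something you need to look up in the actual files, and `count_nonempty` is a hypothetical helper rather than part of the repository.

```python
import csv
from typing import TextIO, Tuple


def count_nonempty(csv_file: TextIO, column: str) -> Tuple[int, int]:
    """Return (rows with a non-empty value in `column`, total rows)."""
    reader = csv.DictReader(csv_file)  # assumes the first line is a header row
    filled = total = 0
    for row in reader:
        total += 1
        # Missing or whitespace-only fields count as empty.
        if (row.get(column) or "").strip():
            filled += 1
    return filled, total
```

Open one of the daily files placed under data/parsed/ with `open(path, newline="")` and pass the file object together with the name of the column you are interested in, for instance the one holding the HTTPS record.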

Using the plotting data

The plotting data in the data/plotting/ directory is enough to reproduce the graphs from the paper, so no additional data downloads are needed. Open the Jupyter Notebook files in the notebooks/ directory to generate the graphs presented in the paper.

GitHub repository

You can download the code here - Link

- Collecting DNS data from Tranco domains

We provide scripts to send DNS queries to Tranco domains and collect responses. Additionally, we offer code to test TLS connection establishment for domains (e.g., establishing TLS connections to domains with mismatched IP addresses in HTTPS records). Please refer to README.md for further instructions.
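
To illustrate what a query for an HTTPS record looks like on the wire, here is a minimal sketch that builds one using only the Python standard library. It is not the code in the repository, which may well use a DNS library instead; the function names are hypothetical. The sketch sets only the RD flag and omits EDNS0, so a query built this way would not request DNSSEC records such as RRSIGs.

```python
import secrets
import struct
from typing import Optional

HTTPS_RRTYPE = 65  # HTTPS resource record type (RFC 9460)
CLASS_IN = 1       # Internet class


def encode_name(domain: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    encoded = b""
    for label in domain.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError(f"invalid label in {domain!r}")
        encoded += bytes([len(raw)]) + raw
    return encoded + b"\x00"


def build_https_query(domain: str, query_id: Optional[int] = None) -> bytes:
    """Build a wire-format DNS query asking for the HTTPS record of `domain`."""
    if query_id is None:
        query_id = secrets.randbelow(1 << 16)
    # Header: ID, flags (only RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    question = encode_name(domain) + struct.pack("!HH", HTTPS_RRTYPE, CLASS_IN)
    return header + question
```

The resulting bytes can be sent in a UDP datagram to port 53 of a resolver using the `socket` module; parsing the response is left out of this sketch.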

GitHub repository

You can download the code here - Link