Website Name: Centene Corporation

This folder has the data for the link: https://www.centene.com/price-transparency-files.html

There are 6 JSON index files, each placed in its own folder (Folder1 to Folder6). The details of each are given below.

---------------------------------------------Details----------------------------------------------------------------------
Folder1
reporting_entity_type: Third Party Administrator
reporting_entity_name: Centene Management Company LLC

27 links for json.gz were found in ./2022-12-01_ambetter_index.json
Not all of these links end in .json.gz; some lead to other web pages with nothing to download, so the links are filtered to keep only those from which data can be downloaded.
All the downloadable links are also written to this file: 2022-12-01_ambetter_index.txt
All downloaded files are extracted from .json.gz to .json and then converted to .csv. After conversion, the .json files are deleted to save space.
---------------------------------------------------------------------------------------------------------------------------

---------------------------------------------Details----------------------------------------------------------------------
Folder2
reporting_entity_type: Third Party Administrator
reporting_entity_name: Centene Management Company LLC

2 links for json.gz were found in ./2022-12-01_fidelis_index.json
Not all of these links end in .json.gz; some lead to other web pages with nothing to download, so the links are filtered to keep only those from which data can be downloaded.
All the downloadable links are also written to this file: 2022-12-01_fidelis_index.txt
All downloaded files are extracted from .json.gz to .json and then converted to .csv. After conversion, the .json files are deleted to save space.
---------------------------------------------------------------------------------------------------------------------------

---------------------------------------------Details----------------------------------------------------------------------
Folder3
reporting_entity_type: Third Party Administrator
reporting_entity_name: Centene Management Company LLC

2 links for json.gz were found in ./2022-12-01_healthnet_index.json
Not all of these links end in .json.gz; some lead to other web pages with nothing to download, so the links are filtered to keep only those from which data can be downloaded.
All the downloadable links are also written to this file: 2022-12-01_healthnet_index.txt
All downloaded files are extracted from .json.gz to .json and then converted to .csv. After conversion, the .json files are deleted to save space.
---------------------------------------------------------------------------------------------------------------------------

---------------------------------------------Details----------------------------------------------------------------------
Folder4
reporting_entity_type: Third Party Administrator
reporting_entity_name: Centene Management Company LLC

1 link for json.gz was found in ./2022-12-01_mhn_index.json
Not all of these links end in .json.gz; some lead to other web pages with nothing to download, so the links are filtered to keep only those from which data can be downloaded.
All the downloadable links are also written to this file: 2022-12-01_mhn_index.txt
All downloaded files are extracted from .json.gz to .json and then converted to .csv. After conversion, the .json files are deleted to save space.
---------------------------------------------------------------------------------------------------------------------------

---------------------------------------------Details----------------------------------------------------------------------
Folder5
reporting_entity_type: Third Party Administrator
reporting_entity_name: Centene Management Company LLC

1 link for json.gz was found in ./2022-12-01_qualchoice_index.json
Not all of these links end in .json.gz; some lead to other web pages with nothing to download, so the links are filtered to keep only those from which data can be downloaded.
All the downloadable links are also written to this file: 2022-12-01_qualchoice_index.txt
All downloaded files are extracted from .json.gz to .json and then converted to .csv. After conversion, the .json files are deleted to save space.
---------------------------------------------------------------------------------------------------------------------------

---------------------------------------------Details----------------------------------------------------------------------
Folder6
reporting_entity_type: Third Party Administrator
reporting_entity_name: Centene Management Company LLC

1 link for json.gz was found in ./2022-12-01_wellcarenc_index.json
Not all of these links end in .json.gz; some lead to other web pages with nothing to download, so the links are filtered to keep only those from which data can be downloaded.
All the downloadable links are also written to this file: 2022-12-01_wellcarenc_index.txt
All downloaded files are extracted from .json.gz to .json and then converted to .csv. After conversion, the .json files are deleted to save space.
---------------------------------------------------------------------------------------------------------------------------

The .csv files are the actual data. For the code, see either main.py or final_script.ipynb.
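
The filtering and extraction steps described for each folder can be sketched roughly as follows. This is an illustrative sketch only, not the code in main.py or final_script.ipynb: the function names filter_downloadable, write_links and extract_gz are hypothetical, and the check on the URL path (ignoring any query string) is an assumption about how a downloadable link is recognised. The .json to .csv conversion is left out because it depends on the layout of each file.

```python
import gzip
import os
import shutil
from urllib.parse import urlparse


def filter_downloadable(links):
    """Keep only links whose URL path ends in .json.gz (assumed criterion)."""
    return [u for u in links if urlparse(u).path.lower().endswith(".json.gz")]


def write_links(links, txt_path):
    """Write one downloadable link per line, e.g. to an *_index.txt file."""
    with open(txt_path, "w", encoding="utf-8") as fh:
        for url in links:
            fh.write(url + "\n")


def extract_gz(gz_path):
    """Decompress foo.json.gz to foo.json, remove the archive, return the new path."""
    json_path = gz_path[:-3]  # strip the trailing ".gz"
    with gzip.open(gz_path, "rb") as src, open(json_path, "wb") as dst:
        shutil.copyfileobj(src, dst)  # stream, so large files are not held in RAM
    os.remove(gz_path)
    return json_path
```

After a .json file has been converted to .csv, calling os.remove on it would correspond to the clean-up step mentioned above.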