BBJ data directory:
I first downloaded and decrypted the Biobank Japan data (instructions), then organized the files into subdirectories
(BBJ-phenotypes-decrypted), keeping them in their original form.
BBJ phenotypes file:
This CSV combines all phenotype data in the
BBJ-phenotypes-decrypted subdirectory into one file.
The original BBJ phenotype data in
BBJ-phenotypes-decrypted was bulky and used dataset IDs instead of phenotype names. The file
BBJ-phenotype-list.txt lists every phenotype alongside its folder name (download).
I created the combined phenotype file with the following script:
python3 process-phenotypes.py \
    --BBJ_folder /Users/sabrinami/BBJ/BBJ-phenotypes \
    --phenotype_mapping /Users/sabrinami/Github/analysis-sabrina/BBJ-data-processing/BBJ-phenotype-list.txt \
    --output /Users/sabrinami/Github/analysis-sabrina/BBJ-data-processing/BBJ-phenotypes.csv
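The source of process-phenotypes.py is not shown here, so the following is only a rough sketch of the kind of processing described above, not the actual script. It rests on several assumptions: that BBJ-phenotype-list.txt holds one tab-separated "phenotype name, folder name" pair per line, that each dataset folder contains tab-delimited .txt tables with a header row, and that the combined CSV simply stacks those rows with an added phenotype column. The function names load_mapping and combine_phenotypes are hypothetical.

```python
import csv
from pathlib import Path


def load_mapping(mapping_path):
    """Read 'phenotype<TAB>folder' lines into a {folder: phenotype} dict.

    The two-column, tab-separated layout is an assumption about the mapping file.
    """
    mapping = {}
    with open(mapping_path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) >= 2:
                phenotype, folder = row[0].strip(), row[1].strip()
                mapping[folder] = phenotype
    return mapping


def combine_phenotypes(bbj_folder, mapping_path, output_path):
    """Stack every table found under the mapped folders into one CSV.

    Returns the number of data rows written.
    """
    mapping = load_mapping(mapping_path)
    rows, columns = [], ["phenotype"]
    for folder, phenotype in sorted(mapping.items()):
        # Assumption: each dataset folder holds tab-delimited *.txt files
        # whose first line is a header row.
        for table in sorted(Path(bbj_folder, folder).glob("*.txt")):
            with open(table, newline="") as handle:
                for record in csv.DictReader(handle, delimiter="\t"):
                    # Replace the opaque dataset ID with the readable name.
                    record["phenotype"] = phenotype
                    rows.append(record)
                    for name in record:
                        if name not in columns:
                            columns.append(name)
    with open(output_path, "w", newline="") as handle:
        # restval fills columns that a given table does not provide.
        writer = csv.DictWriter(handle, fieldnames=columns, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Under these assumptions, a call such as combine_phenotypes(bbj_folder, mapping_path, output_path) would take the same three paths that the command above passes as --BBJ_folder, --phenotype_mapping, and --output.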
BBJ genotypes folder: