BBJ data directory: \gpfs/data/im-lab/nas40t2/Data/BBJ

I first downloaded and decrypted Biobank Japan data (instructions), then organized into subdirectories BBJ-genotypes-decrypted and BBJ-phenotypes-decrypted, in their original form.


BBJ phenotypes file: gpfs/data/im-lab/nas40t2/Data/BBJ/BBJ-phenotypes.csv

The original BBJ phenotype data, in BBJ-phenotypes-decrypted, was structured so that individual data for each phenotype was in a different folder. For convenience, I combined all the phenotypes in one table, BBJ-phenotypes.csv. The file BBJ-phenotype-list.txt contains all the phenotypes and their folder names (download)

I created the combined phenotype file with the following script:

python3 --BBJ_folder /Users/sabrinami/BBJ/BBJ-phenotypes \
--phenotype_mapping /Users/sabrinami/Github/analysis-sabrina/BBJ-data-processing/BBJ-phenotype-list.txt \
--output /Users/sabrinami/Github/analysis-sabrina/BBJ-data-processing/BBJ-phenotypes.csv


Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The source code is licensed under MIT.

Suggest changes

If you find any mistakes (including typos) or want to suggest changes, please feel free to edit the source file of this page on Github and create a pull request.


For attribution, please cite this work as

Sabrina Mi (2022). Biobank Japan Data in CRI. ImLab Notes. /post/2022/01/02/biobank-japan-data-in-cri/

BibTeX citation

  title = "Biobank Japan Data in CRI",
  author = "Sabrina Mi",
  year = "2022",
  journal = "ImLab Notes",
  note = "/post/2022/01/02/biobank-japan-data-in-cri/"