BBJ data directory: /gpfs/data/im-lab/nas40t2/Data/BBJ

I first downloaded and decrypted the Biobank Japan data (instructions), then organized the files, in their original form, into the subdirectories BBJ-genotypes-decrypted and BBJ-phenotypes-decrypted.

Phenotypes

BBJ phenotypes file: /gpfs/data/im-lab/nas40t2/Data/BBJ/BBJ-phenotypes.csv

The original BBJ phenotype data, in BBJ-phenotypes-decrypted, is structured so that the individual-level data for each phenotype sits in a separate folder. For convenience, I combined all the phenotypes into one table, BBJ-phenotypes.csv. The file BBJ-phenotype-list.txt lists all the phenotypes and their folder names (download).

I created the combined phenotype file with the following script:

python3 process-phenotypes.py --BBJ_folder /Users/sabrinami/BBJ/BBJ-phenotypes \
--phenotype_mapping /Users/sabrinami/Github/analysis-sabrina/BBJ-data-processing/BBJ-phenotype-list.txt \
--output /Users/sabrinami/Github/analysis-sabrina/BBJ-data-processing/BBJ-phenotypes.csv
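The script itself is not reproduced in this post. As a rough illustration of the combining step described above, here is a minimal sketch, not the actual process-phenotypes.py. It assumes a layout that may not match the real BBJ files: the mapping file is tab-separated (phenotype name, then folder name), and each phenotype folder holds CSV files with hypothetical "ID" and "value" columns. The function names read_mapping and combine_phenotypes are also made up for this example.

```python
import csv
from pathlib import Path


def read_mapping(mapping_path):
    """Read an assumed tab-separated mapping: phenotype name, folder name."""
    mapping = {}
    with open(mapping_path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) >= 2:
                mapping[row[0].strip()] = row[1].strip()
    return mapping


def combine_phenotypes(bbj_folder, mapping, output_path):
    """Merge per-phenotype files into one table with a row per individual."""
    bbj_folder = Path(bbj_folder)
    table = {}  # individual ID -> {phenotype name: value}
    for phenotype, folder in mapping.items():
        # Assumption: each folder holds CSV files with "ID" and "value" columns.
        for data_file in sorted((bbj_folder / folder).glob("*.csv")):
            with open(data_file, newline="") as handle:
                for row in csv.DictReader(handle):
                    table.setdefault(row["ID"], {})[phenotype] = row["value"]

    phenotypes = list(mapping)
    with open(output_path, "w", newline="") as handle:
        # restval fills in "NA" for individuals missing a given phenotype.
        writer = csv.DictWriter(
            handle, fieldnames=["ID"] + phenotypes, restval="NA"
        )
        writer.writeheader()
        for individual in sorted(table):
            writer.writerow({"ID": individual, **table[individual]})
    return table
```

Under these assumptions, calling combine_phenotypes(bbj_folder, read_mapping(mapping_path), output_path) writes one row per individual and one column per phenotype, with "NA" where an individual has no record for a phenotype.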

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The source code is licensed under MIT.

Citation

For attribution, please cite this work as

Sabrina Mi (2022). Biobank Japan Data in CRI. ImLab Notes. /post/2022/01/02/biobank-japan-data-in-cri/

BibTeX citation

@misc{mi2022biobank,
  title = "Biobank Japan Data in CRI",
  author = "Sabrina Mi",
  year = "2022",
  journal = "ImLab Notes",
  note = "/post/2022/01/02/biobank-japan-data-in-cri/"
}