HakyImLab Notes

Public Notes

Now generated with Quarto. Add or edit posts on GitHub.

 

How to request an ALCF reservation

how_to
This is a guide on how to request reservations or score boosts for ALCF machines (Polaris, Aurora, Sophia, etc.).
Jul 3, 2025
Sofia Salazar

How to use AlphaGenome to make predictions

how_to
There is a more comprehensive set of tutorials at https://www.alphagenomedocs.com/colabs/quick_start.html that cover many usage options.
Jul 3, 2025
Temi

How to get an AlphaGenome API key

how_to
For the time being, Google requires a user to have an API key to access the AlphaGenome model. To get this key, follow the steps below.
Jul 3, 2025
Temi

compare ratxcan with and without loco

compare ratxcan mixed effects with and without loco

Aug 29, 2024
Haky Im
 

how to apply to F30/31 grants

how-to

material collected by grad students at UChicago

Jul 11, 2024
Jennifer Blanc, Eric McIntire
 

recount quick access

how to quickview recount

May 31, 2024
Haky Im
 

bios25328 lab 1 command line and R

bios25328
lab
We will learn to use the command line by doing. Follow the instructions below.
Mar 18, 2024
Haky Im

how to run for loops in parallel in R

In this example, n ~ 2e5 is the break-even point, beyond which the overhead of parallel computation is offset by the speedup from running in parallel.
Feb 16, 2024
Haky Im
 

query alpha-missense

First, ensure SQLite3 is installed. If it’s not, you can install it using Homebrew on macOS:
Oct 23, 2023
Haky Im
 

How to create a new Quarto Blog

how_to

This is a tutorial on how to create a new Quarto blog using the RStudio interface.

Sep 15, 2023
Sofia Salazar
 

new test

TODO: delete this test post
Aug 10, 2023
Haky Im
 

testing quarto post app

test
You can install the development version of quartopost from GitHub with
Aug 10, 2023
Haky Im

Attenuation bias in PrediXcan

Uncertainty in predicted expression causes attenuation bias and reduces the significance of the association, so significance is underestimated.
Jul 3, 2023
Haky Im
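
The attenuation effect can be illustrated with a small simulation. This is a minimal sketch assuming a classical measurement-error model (noise added to the predictor, independent of everything else), which may differ from the setup used in the post; the variable names, sample size, and noise levels are illustrative assumptions.

```python
import random


def ols_slope(x, y):
    """Slope of the simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 20000

# Hypothetical "true" expression, and a noisy prediction of it.
true_x = [rng.gauss(0, 1) for _ in range(n)]
noisy_x = [x + rng.gauss(0, 1) for x in true_x]

# Trait depends on the true expression with slope 1.
y = [x + rng.gauss(0, 1) for x in true_x]

slope_true = ols_slope(true_x, y)    # close to 1
slope_noisy = ols_slope(noisy_x, y)  # shrunk toward 0

print(slope_true, slope_noisy)
```

Under these assumptions the slope estimated from the noisy predictor shrinks by roughly var(x) / (var(x) + var(noise)), which is 1/2 here.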

qqunif function with filtered p-values

expected under the null is (1:nn) * maxp / (nn+1)
Jun 16, 2023
Haky Im
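
The expected quantiles above can be written out in Python. This is a minimal sketch, assuming `nn` is the number of p-values kept after filtering and `maxp` is the filtering threshold; the function name `expected_null_quantiles` is hypothetical and does not come from the post, which uses R.

```python
def expected_null_quantiles(nn, maxp=1.0):
    """Expected p-value quantiles under the null when only p < maxp are kept.

    Mirrors the R expression (1:nn) * maxp / (nn + 1).
    """
    return [i * maxp / (nn + 1) for i in range(1, nn + 1)]


# Example: 4 p-values kept after filtering at maxp = 0.05.
print(expected_null_quantiles(4, maxp=0.05))
```

With `maxp = 1` this reduces to the usual i / (nn + 1) plotting positions for an unfiltered uniform Q-Q plot.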
 

How to read gencode gtf file

analysis
how_to
To load version 75 of Ensembl
Apr 27, 2023
Haky Im
 

Cistrome DB data

suppressMessages(library(tidyverse))
suppressMessages(library(glue))
PRE = "/Users/haekyungim/Library/CloudStorage/Box-Box/LargeFiles/imlab-data/data-Github/web-data"
##PRE=…
Mar 28, 2023
Haky Im

ERAP2 fine-mapping

analysis
how_to
ERAP2 fine-mapping results DAPG
Mar 28, 2023
Haky Im

Multiple Testing Vignette

vignette
We start defining some parameters for the simulations. The need for these will become obvious later.
Mar 28, 2023
Haky Im
 

Jane Austen Corpus

analysis
how_to
suppressMessages(library(tidyverse))
suppressMessages(library(glue))
PRE = "/Users/haekyungim/Library/CloudStorage/Box-Box/LargeFiles/imlab-data/data-Github/w…
Mar 22, 2023
Haky Im
 

gtex-sample-size-by-tissue

GTEx Sample Size by Tissue
Mar 1, 2023
Haky Im
 

Migrating

how_to
The list of migration instructions will go here.
Feb 28, 2023
Haky

PredictDB weight distribution

analysis
Goal: get effect size distribution of omic traits
Feb 28, 2023
Haky Im
 

Power calculator for mol QTLs

how to
The success of the prediction training depends mostly on whether the corresponding QTL study will be powered. Thus, we provide here the power to detect molecular QTLs for a…
Feb 21, 2023
Haky Im
 

How to create a new quarto blog post

how_to
Useful tips for new posts here
Jan 1, 2023
Haky Im
 

How to prepare effective presentation slides

how to
Citation: Naegle KM (2021) Ten simple rules for effective presentation slides. PLoS Comput Biol 17(12): e1009554. https://doi.org/10.1371/journal.pcbi.1009554
Nov 30, 2022
Haky Im
 

PARG Fall Social 2022

analysis
Not very helpful; next time, better to have a list to select from rather than free text.
Oct 20, 2022
Haky Im
 

ukbREST SQLite Setup

installlation
how to
This post explains how to: (1) query ukbREST in CRI; (2) create an SQLite database from the Postgres database; (3) update the withdrawal list.
Aug 28, 2022
Sabrina Mi
 

How to annotate genes - BioMart Basics

cheatsheet
BioMart is a database containing Ensembl annotations of genes across many species and builds. To query data, you first pick one of the databases: 1. Ensembl Genes 2. Ensembl…
Jun 28, 2022
Sabrina Mi
 

Transcriptome QGT Lab 2022 Setup

The instructions in this blog were written to set up the lab in Rstudio cloud
Jun 14, 2022
 

Transcriptome QGT Training 2023

Notice that we may ask you to share your screen for pedagogic purposes.
Jun 10, 2022
Haky Im
 

How to query eqtl info from GTEx API (a simple way)

how to
This is a simple and convenient way to get information about specific eQTLs from the GTEx API. It may not be suitable for large queries because it is not very fast.
May 22, 2022
Charles Zhou
 

How-to-open-jupyter-notebook-on-CRI-or-RCC

how_to
This is a workflow showing how to open a Jupyter notebook on CRI or RCC. For more detail, see CRI’s instructions and the more general instructions.
May 17, 2022
Charles Zhou

How to calculate Z-score, P-value, Chi2 stat from GWAS

Given one of the statistics in a GWAS (Z-score, P-value, or chi2), calculate the others.
Apr 13, 2022
Haky Im
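
The conversions can be sketched with the Python standard library. This is a minimal sketch, assuming a two-sided test and a chi-square statistic with 1 degree of freedom (so chi2 = Z^2); the function names are hypothetical and are not taken from the post.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def z_to_p(z):
    """Two-sided p-value for a Z-score."""
    return 2.0 * _STD_NORMAL.cdf(-abs(z))


def z_to_chi2(z):
    """Chi-square statistic (1 df, assumed) for a Z-score."""
    return z * z


def p_to_abs_z(p):
    """|Z| for a two-sided p-value with 0 < p < 1.

    The sign of Z cannot be recovered from the p-value alone.
    """
    return -_STD_NORMAL.inv_cdf(p / 2.0)


def chi2_to_p(chi2):
    """Two-sided p-value for a 1-df chi-square statistic."""
    return z_to_p(chi2 ** 0.5)


print(z_to_p(1.96), z_to_chi2(1.96), p_to_abs_z(0.05), chi2_to_p(3.84))
```

Note that `inv_cdf` raises an error for p = 0, and very large |Z| values underflow to p = 0 in double precision, so extreme statistics are usually better handled on the log scale.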

Publishing a ShinyApp

First, we need to install and load the rsconnect package.
Mar 7, 2022
Ethan Tai
 

Biobank Japan Data in CRI

BBJ data directory: /gpfs/data/im-lab/nas40t2/Data/BBJ
Jan 2, 2022
Sabrina Mi

ARIC EA hg38 validation

workflowr
Sep 8, 2021
Sabrina Mi

ARIC EA hg38 validation height

workflowr
Sep 8, 2021
Sabrina Mi
 

ARIC PWAS models

The ARIC model for MetaXcan is in Box: https://uchicago.box.com/s/3sf4y4gv6c7zam0l5fxicpcd3zji5wzc
Sep 8, 2021
Sabrina Mi
 

Covariances EA hg38

workflowr
Sep 8, 2021
Sabrina Mi
 

Weights AA

workflowr
Sep 8, 2021
Sabrina Mi
 

Weights EA

workflowr
Sep 8, 2021
Sabrina Mi
 

Japan Biobank Data Summary (closed)

We have access to dataset JGAS000114 from the Biobank Japan Project. This includes:
Aug 12, 2021
Sabrina Mi
 

Installing R packages without admin access

how to
Note: this content was copied from the Berkeley Statistics website, where the original article can be found.
Jul 29, 2021
Berkeley Statistics

How to annotate a gene with the cytogenetic band

how to
Adding the cytogenetic band to genes is convenient because it provides a somewhat memorable name for the genomic region where the gene is located. Biomart package in…
Jul 28, 2021
Haky Im
 

Tracking the traffic to webpages

how_to
To track traffic to blogdown (Hugo-based) webpages in Google Analytics, follow the steps below.
Jul 26, 2021
Haky Im
 

Error: 0 % SNPs used

FAQ
Mismatch between the model SNP IDs and the genotype/GWAS SNP IDs, e.g., using model rsids to match against genotype variant_ids
Jul 20, 2021
Festus
 

Get list of predictor SNPs and weights in predictdb

how_to
To get a list of SNPs and the corresponding weights to predict expression levels (or splicing) of a given gene, you will first need to download the databases where the…
Jul 19, 2021
Haky Im
 

Download LD blocks

how to
This is a short tutorial on how to download LD block data as summarized by Berisa and Pickrell. The LD data is available in the hg19 genome build and for three different…
Jul 13, 2021
Festus Nyasimi
 

Mount Gardner file system to your computer

how to
This page describes how to map CRI storage to your computer. The approach depends on the operating system you are using; the options are shown…
Jul 12, 2021
Festus Nyasimi
 

Links to How To’s

how_to
Yanyu’s notes on uploading to zenodo
Jul 12, 2021
Haky Im
 

Training Gene Expression Prediction Models

how_to
PrediXcan and TWAS methods in general correlate genetically predicted levels of gene expression traits with complex traits to understand the mechanism behind GWAS loci. A…
Jul 9, 2021
Haky Im
 

PredictDB Tutorial

This tutorial will provide the steps you need to follow to get you started training prediction models and putting them in a format that can be used to run the predixcan…
Jun 28, 2021
Festus Nyasimi
 

Github Cheatsheet

cheatsheet
Note: older repos may have master as the main branch
Jun 16, 2021
Haky Im
 

Creating a new post (deprecated)

how_to
To publish an analysis note in the notebook, you need to have blogdown and hugo installed on your computer. - install.packages('blogdown') - blogdown::install_hugo() - git…
Jun 16, 2021
Haky Im
 

Subsetting HapMap3 SNPs

how_to
This is the README for the code here.
Jun 14, 2021
Yanyu Liang
 

How to use CRI cluster

how_to
cri
gardner
This document needs updating. Here is a quick start guide for randi, which replaced gardner (decommissioned some time ago, as of 2024-07-15).
Jun 10, 2021
 

Installing Conda on CRI

installlation
CRI has versions of miniconda already downloaded through the module commands. However, those sometimes do not work so it is best to just install conda for yourself on CRI.…
Jun 4, 2021
Natasha Santhanam
 

Heritability Calculation

How to calculate Heritability using GCTA
Jun 3, 2021
Natasha Santhanam
 

Querying PredictDB sqlite databases

how_to
PredictDB databases are stored in simple SQLite files. You can query them programmatically via Python, R, Perl, etc. (using appropriate libraries). Below is an example on how…
Apr 27, 2021
Haky Im
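
A programmatic query from Python might look like the following. This is a minimal sketch, not the example from the post: it assumes a `weights` table with `rsid`, `gene`, `weight`, `ref_allele` and `eff_allele` columns, and it builds a toy in-memory database with made-up rows so that it runs on its own. Check the actual schema of the database you download before relying on these names.

```python
import sqlite3


def snp_weights_for_gene(conn, gene_id):
    """Return (rsid, weight) pairs for one gene, ordered by rsid."""
    cur = conn.execute(
        "SELECT rsid, weight FROM weights WHERE gene = ? ORDER BY rsid",
        (gene_id,),
    )
    return cur.fetchall()


# Toy stand-in for a model database; the schema and rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE weights "
    "(rsid TEXT, gene TEXT, weight REAL, ref_allele TEXT, eff_allele TEXT)"
)
conn.executemany(
    "INSERT INTO weights VALUES (?, ?, ?, ?, ?)",
    [
        ("rs2", "GENE_A", -0.05, "C", "T"),
        ("rs1", "GENE_A", 0.12, "A", "G"),
        ("rs3", "GENE_B", 0.30, "G", "A"),
    ],
)

rows = snp_weights_for_gene(conn, "GENE_A")
print(rows)
```

To query a real file, replace `":memory:"` with the path to the downloaded `.db` file and skip the table-creation step.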
 

Bionimbus PDC

how_to
References: https://www.opensciencedatacloud.org/support/ssh.html and https://www.opensciencedatacloud.org/support/2fa.html
Mar 31, 2021
Sabrina Mi
 

SPrediXcan Harmonization Errors

The error message "INFO - 0 % of model's snps used" can typically be traced to inconsistencies between variant IDs in prediction models and input GWAS files. Our GTEx v8…
Jan 7, 2021
Sabrina Mi
 

PsychENCODE Models

Gandal et al analyzed autism spectrum disorder, schizophrenia, and bipolar disorder across multiple levels of transcriptomic organization—gene expression, local splicing…
Dec 21, 2020
Sabrina Mi
 

PrediXcan 0% variant mapping issue

Many users had difficulties matching the genotype variant id to the prediction model variant id.
Dec 1, 2020
Haky Im
 

How to interpret a p-value of 0

A p-value of zero should be interpreted as an extremely small positive value.
Dec 1, 2020
Haky Im
 

CRI Gardner upgrade news

news
Operating System Upgrade - The operating system will be upgraded from Red Hat Linux 6.7 to 7.6. This will provide a kernel that will allow for a more modern software…
Nov 30, 2020
Haky Im
 

Working with Large Files

When working with large datasets, only the files with code should be pushed to GitHub repositories, not the data itself. The raw data inputs or analysis output should either…
Nov 23, 2020
Sabrina Mi
 

How to convert GTEx v8 model to hg19 based on UK Biobank SNP set mapping

/gpfs/data/im-lab/nas40t2/Data/References/mappings/UKB2GTEx_mapping.txt.gz contains information for variants in UK Biobank genotypes. The columns are
Nov 23, 2020
Sabrina Mi
 

Installing/Running Tensorqtl on CRI

Note: You can download tensorqtl using pip install. However, there seems to be a bug that makes tensorqtl incompatible with pandas plink 2.2.2. If you want to download…
Nov 17, 2020
Natasha
 

Creating Environments in CRI

Create the new environment in the lab share
Nov 13, 2020
Natasha Santhanam
     

    Installing tensorqtl module

    installlation
    Installing tensorqtl requires PyTorch, which normally targets GPUs, but a CPU-based version is also available.
    Nov 11, 2020
    Festus Nyasimi
     

    Querying bgen files in CRI

    A bgen file has a header block with information about the file, including the number of samples, the number of variant data blocks, and flags which describe how data is stored.…
    Nov 10, 2020
    Sabrina Mi

    Calculating H2 across GTEX Brain Samples

    After receiving a .RAR file with reaction rate data for 13 brain tissue samples, we wanted to calculate h2 and see if any of the thousands of reactions were heritable.
    Nov 6, 2020
    Natasha Santhanam

    Exploration on regressing out PCs

    Some going-to-nowhere exploration on regressing out PCs

    Nov 4, 2020
    Yanyu Liang

    GTEx reaction rates h2 similar to permuted h2

    We want to investigate whether reaction rates estimated with imat(?) are heritable. Reaction rates (binary variables) were estimated by D using GTEx gene expression data.
    Nov 2, 2020
    Haky Im
     

    Configuring Windows Subsystem for Linux

    how_to
    This is a guide to configuring a Windows system to utilize many of the tools that the Im Lab uses.
    Aug 10, 2020
    Ian Waters

    gwas-catalog

    in 2020 [1] 94664 3
    Aug 2, 2020
    Hae Kyung Im
     

    How to Prepare a Post Implementation Report?

    how_to
    This is a guide on how to prepare a Post-Implementation Report for cloud credits, for future reference. This guide is based on the one prepared for the deadline of February…
    Jul 30, 2020
     

    Computing on ANL Servers

    how_to
    To get access to the servers at ANL, you will need at minimum an account with Argonne’s MCS division, as well as access to specific computing groups. This process is usually…
    Jul 30, 2020
     

    How to Use AWS

    how_to
    Stop/start your EC2 instances programmatically through the AWS Command Line Interface (CLI).
    Jul 30, 2020
     

    How to Submit to Zenodo

    how_to
    How to submit data to Zenodo
    Jul 30, 2020
     

    How to Use Workflowr

    how_to
    Open RStudio
    Jul 30, 2020
     

    Downloading and Decrypting dbGaP Data

    how_to
    This page is about downloading and decrypting data from NCBI dbGaP. I will split up the instructions into Downloading and Decrypting
    Jul 30, 2020
     

    How to Configure Custom SSH Connection

    how_to
    This wiki will show you how to set up SSH keys and configure custom connection options for accessing UChicago tarbell. After successful configuration, you can log in to your…
    Jul 30, 2020
     

    Molecular data available in GDC

    how_to
    Jul 30, 2020
     

    How to Use RCC Cluster

    how_to
    https://rcc-uchicago.github.io/user-guide/
    Jul 30, 2020

    Downloading Data from Biobank Japan

    how_to
    The same info can be found at this page, https://www.ddbj.nig.ac.jp/jga/download-e.html, but the steps may be slightly different because we applied for Biobank Japan data…
    Jul 30, 2020
    Sabrina Mi
     

    Querying BigQuery

    how_to
    find examples here https://hakyimlab.github.io/bigquery-covid19/query-phenomexcan.html
    Jul 30, 2020
     

    Hands-On Training: plink

    how_to
    LATEST VERSION IN https://bios25328.hakyimlab.org/post/2021/04/09/plink-tutorial/
    Jul 14, 2020
     

    Using R_Markdown

    how_to
    R Markdown is an authoring framework that allows for reproducible documentation of data science within the context of RStudio. This is an introduction designed to teach you…
    Jul 8, 2020
    Ian Waters
     

    Hands-On Training: R

    how_to
    Swirl is a great and easy way to get you started with R. Install and open it by clicking the green arrow on the right.
    Jul 5, 2020
    Laura Vairus
     

    Hands-On Training: Command Line

    how_to
    In this tutorial, we will learn some basic Unix/Linux commands to perform tasks in the command line.
    The command line is an interface that allows you to store, manage, and…

    Jul 2, 2020
    Laura Vairus
     

    Converting Fusion weight to PredictDB format

    how_to
    If you want to use the MetaXcan suite of tools, you will need to format your prediction weights in sqlite format.
    Jun 22, 2020
    Sabrina Mi
     

    IntroStatGen R Studio Servers using Google Cloud

    how_to
    For the one-day seminar, we had a hands-on lab where we decided we needed to set up R Studio Servers. The servers needed pre-loaded data, access to a terminal, pre-compiled…
    Jun 16, 2020
    Owen Melia
     

    benjamin moore colors

    Read the Benjamin Moore list of colors with HEX codes from colornerd
    Jan 1, 2020
    Haky Im
     

    GWAS on ANL Servers

    how_to
    Steps to running the GWAS on ANL’s servers.
    Jul 30, 2001
     

    dbGaP Project Renewal

    how_to
    The email reminder to renew dbGaP projects should link to My Projects tab after logging in, where you can select the project to request renewal. For the most part, the…
    Jul 30, 2001
     

    How to Download the Data from the UK Biobank

    how_to
    Application 19526 is the main application from which imlab downloads data. Other IDs correspond to specific collaborations with other investigators at UChicago and Argonne.
    Jul 30, 2001
     

    lego database

    downloads schema in /Users/haekyungim/Library/CloudStorage/Box-Box/LargeFiles/imlab-data/data-Github/web-data/2000-01-01-lego-database/downloads_schema.png
    Jan 1, 2000
    Haky Im

    Iris prediction with neural network

    ## conda install pytorch torchvision torchaudio cudatoolkit=<version> -c pytorch
    
    ## (test-env) MED-ML-4210:1999-01-01-iris-dataset-analysis haekyungim$…
    Jan 1, 1999
    Haky Im

    iris dataset analysis

    # Load the iris dataset
    data(iris)
    
    # Perform Fisher's discriminant analysis
    library(MASS)
    lda_model <- lda(Species ~ ., data = iris)
    
    # Print the summary…
    Jan 1, 1999
    Haky Im
     

    test

    other text here
    Jan 1, 1999
    testing author

      Reuse

      CC BY 4.0