---
title: query alpha-missense
date: 2023-10-23
author: Haky Im
editor_options:
  chunk_output_type: console
---

```{r eval=FALSE}
suppressMessages(library(tidyverse))
suppressMessages(library(glue))

PRE = "/Users/haekyungim/Library/CloudStorage/Box-Box/LargeFiles/imlab-data/data-Github/web-data"
SLUG = "query-alpha-missense"  ## copy the slug from the header
bDATE = '2023-10-23'           ## copy the date from the blog's header here
DATA = glue("{PRE}/{bDATE}-{SLUG}")
if(!file.exists(DATA)) system(glue::glue("mkdir {DATA}"))
WORK = DATA
## system(glue("open {DATA}")) ## this will open the folder
```
### 1. Create an SQLite Database
First, ensure SQLite3 is installed. If it’s not, you can install it using Homebrew on macOS:
```bash
brew install sqlite3
```
Now, let’s create an SQLite database from the TSV file:
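```bash
## this needs the uncompressed file: sqlite3's .import cannot read .gz directly
gunzip -k AlphaMissense_aa_substitutions.tsv.gz

sqlite3 AlphaMissense.db <<EOF
CREATE TABLE mutations (uniprot_id TEXT, protein_variant TEXT, am_pathogenicity REAL, am_class TEXT);
.mode tabs
.import AlphaMissense_aa_substitutions.tsv mutations
EOF
```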
Alternatively, the table can be loaded from R with `RSQLite` (the load itself is left commented out below); the `VACUUM` at the end compacts the database file:

```{r eval=FALSE}
## install.packages("readr")
library(RSQLite)
library(readr)

# Connect to the SQLite database
con <- dbConnect(SQLite(), glue("{DATA}/test-alpha-missense-query/AlphaMissense.db"))

# Read the gzipped TSV file (the first lines are a license header)
# data <- read_tsv(glue("{DATA}/AlphaMissense_aa_substitutions.tsv.gz"), skip = 3)
# head(data)

# Write the data into the SQLite database
# tic = Sys.time()
# dbWriteTable(con, "mutations", data)
# toc = Sys.time()
# print(toc - tic)

# Reclaim unused space and defragment the database file
dbExecute(con, "VACUUM")

# Close the connection
dbDisconnect(con)
```
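One addition that is not part of the original recipe: the full amino-acid substitution table is large, and the app below filters on `uniprot_id` and `protein_variant`, so an index on those columns should speed up lookups substantially. A minimal sketch (the index name is arbitrary):

```{r eval=FALSE}
library(RSQLite)

# Index the columns used in the app's WHERE clauses so queries avoid a full table scan
con <- dbConnect(SQLite(), glue("{DATA}/test-alpha-missense-query/AlphaMissense.db"))
dbExecute(con, "CREATE INDEX IF NOT EXISTS idx_uniprot_variant ON mutations (uniprot_id, protein_variant)")
dbDisconnect(con)
```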
Note: the `.import` command above assumes that the TSV file doesn't contain header lines. If it does, create a version of the TSV file without them, or delete the imported header row from the SQLite table afterwards.
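For example, if the header row was imported as data, its `uniprot_id` field holds the literal column name, so something like this (an illustration, assuming the schema above) removes it:

```bash
sqlite3 AlphaMissense.db "DELETE FROM mutations WHERE uniprot_id = 'uniprot_id';"
```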
### 2. Create a Shiny App
Make sure you have R, Shiny, and the required packages installed:
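```{r eval=FALSE}
install.packages("shiny")
install.packages("RSQLite")
install.packages("DT")
```

Here's a simple Shiny app:

```{r eval=FALSE}
library(shiny)
library(RSQLite)
library(DT)

# Connect to the SQLite database
con <- dbConnect(SQLite(), "AlphaMissense.db")

ui <- fluidPage(
  titlePanel("AlphaMissense Query"),
  sidebarLayout(
    sidebarPanel(
      textInput("uniprot_id", "UniProt ID", ""),
      textInput("protein_variant", "Protein Variant", ""),
      actionButton("query_button", "Query")
    ),
    mainPanel(
      DTOutput("results_table")
    )
  )
)

server <- function(input, output) {
  observeEvent(input$query_button, {
    # Build the query from the non-empty inputs
    # (for a public-facing app, prefer parameterized queries over paste0
    # to avoid SQL injection)
    query <- "SELECT * FROM mutations WHERE 1=1 "
    if (input$uniprot_id != "") {
      query <- paste0(query, "AND uniprot_id = '", input$uniprot_id, "' ")
    }
    if (input$protein_variant != "") {
      query <- paste0(query, "AND protein_variant = '", input$protein_variant, "'")
    }
    results <- dbGetQuery(con, query)
    output$results_table <- renderDT({ results })
  })
}

shinyApp(ui = ui, server = server)
```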
This app provides two input fields: one for the `uniprot_id` and another for the `protein_variant`. If the user fills out the `uniprot_id` and clicks "Query", all matching mutations for that UniProt ID will be displayed. If they also provide a protein variant, the app will display only the matching mutation.
To run this app, save the code into a file, say `app.R`, then from the R console (or RStudio), navigate to the directory containing both the SQLite database (`AlphaMissense.db`) and the app (`app.R`), and run:
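```{r eval=FALSE}
library(shiny)
runApp(glue("{DATA}/app.R"))
```

A later variant of the app queries Google BigQuery and needs the `bigrquery` package; the variants below were run locally the same way:

```{r eval=FALSE}
install.packages("bigrquery")
library(bigrquery)
```

```{r eval=FALSE}
runApp(glue("~/Downloads/alphamissense-query/app.R"))
```

```{r eval=FALSE}
library(shiny)
runApp(glue("~/Downloads/alphamissense-query-hugo/app.R"))
```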
The error suggests that the deployed Shiny app is not able to authenticate with Google BigQuery because it does not have the required credentials. When you run the Shiny app locally on your machine, it might be using your personal Google credentials. However, when the app is deployed on `shinyapps.io`, it does not have access to these credentials.
To resolve this:

1. **Service Account**:
   - You'll need to create a service account in the Google Cloud Console that has access to the BigQuery dataset.
   - Download the JSON key for this service account.

2. **Include Service Account in Shiny App**:
   - Save the JSON key in the same directory as your Shiny app.
   - In your Shiny app, before making any BigQuery calls, authenticate using this service account.

```R
library(bigrquery)

# Authenticate using the service account key
bq_auth(path = "path_to_service_account.json")
```

3. **Deploy the Shiny App**:
   - Make sure you deploy both the Shiny app files (`app.R`, etc.) and the service account JSON key to `shinyapps.io`.

4. **Permissions**:
   - Ensure the service account has the necessary permissions on BigQuery to read the data.

5. **Billing**:
   - If your Google Cloud Project uses billing (as BigQuery is a paid service), make sure the service account has permissions to incur charges.

6. **Avoid Hardcoding Paths**:
   - Instead of hardcoding the path to the service account file, you might want to use a more dynamic approach, for instance:

```R
# Fall back to a bundled default key if the environment variable is not set
service_account_path <- Sys.getenv("SERVICE_ACCOUNT_PATH", "default_service_account.json")
bq_auth(path = service_account_path)
```
You can then set the SERVICE_ACCOUNT_PATH environment variable appropriately, either locally or on shinyapps.io.
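Locally you can set it with `Sys.setenv()`; on `shinyapps.io`, which (as far as I know) has no settings panel for environment variables, a common workaround is to deploy an `.Renviron` file alongside `app.R`. A sketch, with a hypothetical key filename:

```{r eval=FALSE}
# Locally, before launching the app:
Sys.setenv(SERVICE_ACCOUNT_PATH = "service_account.json")
```

And in `.Renviron`, deployed in the same directory as `app.R`:

```
SERVICE_ACCOUNT_PATH=service_account.json
```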
After making these adjustments, try redeploying your Shiny app.
Finally, build a HUGO symbol to UniProt ID mapping with biomaRt (presumably what the `alphamissense-query-hugo` variant of the app uses):

```{r eval=FALSE}
library(biomaRt)

# Choose the ENSEMBL mart and dataset for human genes
ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")

# Fetch the required attributes: HUGO name and UniProt ID
mapping <- getBM(attributes = c('hgnc_symbol', 'uniprotswissprot'), mart = ensembl)

# Filter out entries with empty UniProt IDs or HUGO names
mapping <- mapping[mapping$hgnc_symbol != '' & mapping$uniprotswissprot != '', ]

write.csv(mapping, "hugo_to_uniprot_mapping.csv", row.names = FALSE)
```
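To sketch how this mapping plugs into the SQLite queries above (the gene symbol `TP53` is just an example; the column names are the ones `getBM()` returns):

```{r eval=FALSE}
library(RSQLite)

# Translate a HUGO symbol to its UniProt ID via the mapping file
mapping <- read.csv("hugo_to_uniprot_mapping.csv")
uid <- mapping$uniprotswissprot[mapping$hgnc_symbol == "TP53"][1]

# Query the AlphaMissense table by UniProt ID
# (parameterized, which also avoids the quoting issues of paste0)
con <- dbConnect(SQLite(), "AlphaMissense.db")
results <- dbGetQuery(con, "SELECT * FROM mutations WHERE uniprot_id = ?", params = list(uid))
dbDisconnect(con)
head(results)
```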