This document accompanies the slides from BaRC’s Introduction to R and shows the use of some simple commands. See the accompanying slides and data files (and material from other Hot Topics talks) at http://barc.wi.mit.edu/hot_topics/ (when you click on “Statistics”). All of the commands below should work similarly if run on Windows/Mac/Linux (using the typical R installation or RStudio).
Our example data is the number of tumors in wild-type and knockout mice, each assayed in triplicate. For variable names, we use one-word names (no spaces) that can include any combination of lowercase and uppercase letters and numbers, although they must begin with a letter. It can be helpful to include dots to make the variables easier to read, like MGF.ko.trial.1. When R reads column names from a file, it converts many special characters (such as dashes and spaces) into dots.
After starting your R session, save this file in your working directory.
##Part I: Getting familiar with R
###1 Entering data by hand
The c() (combine) command is used to concatenate several pieces of data into a vector. These can be numbers or text (but the latter must be surrounded by quotes).
# These are the tumor counts for the WT animals.
wt = c(5, 6, 7)
# These are the tumor counts for the KO animals.
ko = c(8, 9, 11)
wt
## [1] 5 6 7
ko
## [1] 8 9 11
Note that typing the variable by itself prints the values. The [1] at the beginning of each output row shows that the first value in the row is the first entry in the vector (list). This becomes useful if we have a vector containing more values than can be printed on one line.
We’ve also included some comments. Whenever R sees a pound sign #, it ignores the rest of the line. A liberal use of comments is very helpful to remind you or show others what each command does and perhaps why you chose the specific options you did.
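As a small illustration of the points above, the sketch below builds a text vector and a longer numeric vector. The names genotypes and long.vector are made up for this example and are not used elsewhere in this document.

```r
# Text values must be quoted; c() combines them into a character vector.
genotypes = c("wt", "wt", "wt", "ko", "ko", "ko")
genotypes

# A vector too long for one line wraps, and each output row starts
# with the index of its first value ([1] on the first row, a larger
# index on each later row).
long.vector = 101:140
long.vector

# length() reports how many values a vector holds.
length(long.vector)
```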
###2 Running a statistical test
t.test(wt, ko)
##
## Welch Two Sample t-test
##
## data: wt and ko
## t = -3.1623, df = 3.4483, p-value = 0.04191
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -6.4542173 -0.2124494
## sample estimates:
## mean of x mean of y
## 6.000000 9.333333
Unlike Excel, R's output for a statistical test provides more information than just the p-value. But what if we don’t want to use the default t-test options? How can we find out how to change the default settings? Try entering the command, preceded by a question mark, as in ?t.test
Running this command on a Windows or Mac computer will usually open a documentation page in your web browser, which can be easier to read than the text-only Unix way of printing documentation.
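Besides reading the help page, we can inspect a command's options from the console. The sketch below looks up the default t.test method and lists its arguments; t.test.args is just a name chosen for this example.

```r
# Two equivalent ways to open the documentation page for t.test
# (commented out here because they open a help viewer).
# ?t.test
# help("t.test")

# Look up the default t.test method and list its arguments.
t.test.args = formals(getS3method("t.test", "default"))
names(t.test.args)

# The default for var.equal is FALSE, which is why we got Welch's test.
t.test.args$var.equal
```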
We can also save the results of a statistical analysis as another variable, which will then contain all the information that you saw printed to the screen (and sometimes a lot more). For example, let’s run a standard (i.e. Student’s) t-test (note the var.equal=T, which tells R to assume that the two groups have equal variances).
wt.vs.ko = t.test(wt, ko, var.equal=T)
wt.vs.ko
##
## Two Sample t-test
##
## data: wt and ko
## t = -3.1623, df = 4, p-value = 0.03411
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -6.2599634 -0.4067032
## sample estimates:
## mean of x mean of y
## 6.000000 9.333333
If we want to print or save just part of this output, we need to know how to access its different parts. Try the names() command to see what each part is called.
names(wt.vs.ko)
## [1] "statistic" "parameter" "p.value" "conf.int" "estimate"
## [6] "null.value" "stderr" "alternative" "method" "data.name"
Now we can access each part of this variable using the syntax VARIABLE$PART, such as
wt.vs.ko$p.value
## [1] 0.03410942
wt.vs.ko$conf.int
## [1] -6.2599634 -0.4067032
## attr(,"conf.level")
## [1] 0.95
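The same VARIABLE$PART syntax works for the other parts listed by names(). As a sketch, the example below pulls out a few more pieces and pastes them into a one-line summary; mean.diff and summary.line are names invented for this example.

```r
wt = c(5, 6, 7)
ko = c(8, 9, 11)
wt.vs.ko = t.test(wt, ko, var.equal=TRUE)

# The t statistic and the group means are stored as named numeric vectors.
wt.vs.ko$statistic
wt.vs.ko$estimate

# Difference in means (wt minus ko), computed from the stored estimates.
mean.diff = unname(wt.vs.ko$estimate[1] - wt.vs.ko$estimate[2])

# Build a one-line summary from the pieces.
summary.line = paste0(wt.vs.ko$method, ": difference = ", round(mean.diff, 2),
                      ", p = ", signif(wt.vs.ko$p.value, 3))
summary.line
```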
What if we want to get the commands we used, to put them into a script to document our analysis pipeline? Use the “history” command, as in history(max.show=Inf).
###3 Reading data from a file
To get started you probably want to get R and your data in the same place. To get R’s working directory, try
getwd()
## [1] "/nfs/BaRC_training/Intro_to_R"
To change the working directory on your computer, go to the File => Change dir… menu. On tak (or you can do it this way on your computer, too), use the setwd() command, like setwd("/lab/solexa_public/Graceland_Lab/Elvis").
We can see what files we have in our directory. (You won’t have all of these.)
dir()
## [1] "Intro_R.html" "Intro_R.Rmd" "tumor_boxplot.pdf"
## [4] "tumor_boxplot.png" "tumor_pvals.txt"
Let’s assume that we have a tab-delimited text file of our dataset (“tumors_wt_ko.txt”), with one column for wt and another for ko, with column labels in the first row. One way to read the file is with the read.delim() command. Note that we’ve included header=T to indicate that the first row has our column names. If we use header=F then the columns will just be numbered (and the data will include the first row of the file).
tumors = read.delim("http://barc.wi.mit.edu/education/hot_topics/Intro_to_R_2015/tumors_wt_ko.txt", header=T)
tumors
## wt ko
## 1 5 8
## 2 6 9
## 3 7 11
Our data looks OK: we have one (labeled) column for each group of mice.
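To see the effect of the header option without relying on the network, here is a minimal sketch that writes a tiny tab-delimited file to a temporary location and reads it both ways. The names example.file, with.header and no.header are made up for this example.

```r
# Write a small tab-delimited file to a temporary location.
example.file = tempfile(fileext=".txt")
writeLines(c("wt\tko", "5\t8", "6\t9", "7\t11"), example.file)

# header=TRUE: the first row supplies the column names.
with.header = read.delim(example.file, header=TRUE)
with.header

# header=FALSE: columns are named V1 and V2 and the first row becomes data,
# so the columns are no longer read as numbers.
no.header = read.delim(example.file, header=FALSE)
no.header
str(no.header)
```

With header=F the column labels end up mixed in with the counts, so a command like t.test() would no longer work on those columns directly.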
###4 Accessing a data matrix
How do we access subsets of tumors? We can use column names or row and column numbers.
tumors$wt
## [1] 5 6 7
tumors$ko
## [1] 8 9 11
tumors[1:3,1]
## [1] 5 6 7
tumors[1:2,1:2]
## wt ko
## 1 5 8
## 2 6 9
tumors[,1]
## [1] 5 6 7
t.test(tumors$wt, tumors$ko)
##
## Welch Two Sample t-test
##
## data: tumors$wt and tumors$ko
## t = -3.1623, df = 3.4483, p-value = 0.04191
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -6.4542173 -0.2124494
## sample estimates:
## mean of x mean of y
## 6.000000 9.333333
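A few more ways to subset the same table are sketched below. The data frame is rebuilt by hand here so the example runs on its own.

```r
tumors = data.frame(wt=c(5, 6, 7), ko=c(8, 9, 11))

# Select a column by name inside the square brackets.
tumors[, "ko"]

# Drop the first row with a negative index.
tumors[-1, ]

# Keep only the rows where the wt count is greater than 5.
tumors[tumors$wt > 5, ]

# Extract a single cell: row 3 of the ko column.
tumors[3, "ko"]
```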
###5 Creating an output table
Unless we have just a little data, we may want to run several analyses and create a file to hold the output. One way to do that is to create a table to hold the output data. In this case, let’s say we want to try two variations of two statistical tests. We should note that normally we’d decide on a statistical test when designing an experiment, rather than trying a bunch of them after we already have the data. Let’s start by creating an empty matrix of two rows and two columns, labeled to show what we want each cell to hold.
pvals.out = matrix(data=NA, ncol=2, nrow=2)
colnames(pvals.out) = c("two.tail", "one.tail")
rownames(pvals.out) = c("Welch", "Wilcoxon")
pvals.out
## two.tail one.tail
## Welch NA NA
## Wilcoxon NA NA
Now let’s run some statistical tests and put the output p-values into our table. We’ll start with Welch’s test for the first row, once as a two-sided test and once as a one-sided test.
pvals.out[1,1] = t.test(tumors$wt, tumors$ko)$p.value
pvals.out[1,2] = t.test(tumors$wt, tumors$ko, alt="less")$p.value
Now let’s try the Wilcoxon rank sum test, a non-parametric alternative to the t-test.
pvals.out[2,1] = wilcox.test(tumors$wt, tumors$ko)$p.value
pvals.out[2,2] = wilcox.test(tumors$wt, tumors$ko, alt="less")$p.value
pvals.out
## two.tail one.tail
## Welch 0.04191452 0.02095726
## Wilcoxon 0.10000000 0.05000000
Since we have row names, we can also use them to access subsets of our matrix:
pvals.out["Welch",]
## two.tail one.tail
## 0.04191452 0.02095726
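With only four cells, filling the table one assignment at a time is fine. For a larger table, one option is to loop over the row and column names instead. This is only a sketch of that idea; the names tests, alternatives and pvals.loop are invented for this example.

```r
wt = c(5, 6, 7)
ko = c(8, 9, 11)

# The test functions and the alternatives we want, labeled to match the table.
tests = list(Welch=t.test, Wilcoxon=wilcox.test)
alternatives = c(two.tail="two.sided", one.tail="less")

# Empty matrix with row and column names set at creation.
pvals.loop = matrix(NA, nrow=length(tests), ncol=length(alternatives),
                    dimnames=list(names(tests), names(alternatives)))

# Run each test with each alternative and store the p-value by name.
for (test.name in names(tests)) {
  for (alt.name in names(alternatives)) {
    result = tests[[test.name]](wt, ko, alternative=alternatives[[alt.name]])
    pvals.loop[test.name, alt.name] = result$p.value
  }
}
pvals.loop
```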
###6 Printing an output table
Our table looks fine, except for the fact that we have so many digits that some are meaningless. We could either round them now or later. If we choose to do it now, set the number of significant figures:
pvals.out.rounded = signif(pvals.out, 3)
pvals.out.rounded
## two.tail one.tail
## Welch 0.0419 0.021
## Wilcoxon 0.1000 0.050
Now let’s print the table.
write.table(pvals.out.rounded, file="tumor_pvals.txt", sep="\t", quote=F)
We’ve specified our output filename, that the file should be tab-delimited (sep="\t"), and that we don’t want quotes around any text (quote=F). Now we can open “tumor_pvals.txt” in any text editor or Excel. If we do so in Excel, however, our column labels will need to be shifted over one column. To avoid this, we could have created a 3-column table, with a first column containing “Welch” and “Wilcoxon”, and then printed the table with the additional option row.names=F.
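One way to build the 3-column table described above is sketched here. The rounded p-values are typed in by hand, the output goes to a temporary file, and the names pvals.df and out.file (and the column label "test") are choices made for this example.

```r
# Rounded p-values from the table above, entered by hand for this example.
pvals = matrix(c(0.0419, 0.1, 0.021, 0.05), nrow=2,
               dimnames=list(c("Welch", "Wilcoxon"), c("two.tail", "one.tail")))

# Move the row names into a real first column called "test".
pvals.df = data.frame(test=rownames(pvals), pvals, row.names=NULL)

# Write without row names, so the header lines up with the columns.
out.file = tempfile(fileext=".txt")
write.table(pvals.df, file=out.file, sep="\t", quote=FALSE, row.names=FALSE)

# Look at the raw lines of the file.
readLines(out.file)
```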
###7 Creating figures
R is very powerful and very flexible with its figure generation. Besides statistics, graphics are another great reason to learn R. We can start by creating a simple boxplot of our data.
boxplot(tumors)
Normally if we only had three points of data, a scatterplot would be more informative than a boxplot. Nevertheless, let’s add some more details to make the figure more readable.
boxplot(tumors, col=c("gray", "red"), main="MFG appears to be a tumor suppressor", ylab="number of tumors")
If we run R on our computer, a figure will pop up and can then be saved by right-clicking on it. On tak, a figure will end up in a file called “Rplots.pdf”.
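Since we only have three measurements per group, here is a sketch of the kind of point-by-point plot mentioned above, using base R's stripchart(). The data frame is rebuilt by hand so the example runs on its own, and the plotting options are just one reasonable choice.

```r
tumors = data.frame(wt=c(5, 6, 7), ko=c(8, 9, 11))

# Plot every measurement as a point, one column of points per group.
# method="jitter" nudges points sideways so identical values stay visible.
stripchart(as.list(tumors), vertical=TRUE, method="jitter", pch=16,
           col=c("gray", "red"), ylab="number of tumors")
```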
Normally it’s more useful to include filenames in our code, as this makes it more reproducible and more automated. To do this, we need to first issue a command telling R what graphical format to use. Then we write our plotting command(s). A PDF file can hold multiple figures, by default one per page. Finally, we need to “close” the file when we’re finished with it.
#Create a PDF file 11 inches wide and 8.5 inches high
pdf("tumor_boxplot.pdf", w=11, h=8.5)
boxplot(tumors)
dev.off()
## png
## 2
Another common format is PNG. The syntax is similar, except the figure dimensions are in pixels rather than inches.
png("tumor_boxplot.png", w=1800, h=1200)
boxplot(tumors)
dev.off()
## png
## 2
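As noted above, a PDF file can hold more than one figure. The sketch below writes two plots into a single two-page PDF in a temporary directory; the filename tumor_two_pages.pdf and the variable multi.page.file are made up for this example.

```r
tumors = data.frame(wt=c(5, 6, 7), ko=c(8, 9, 11))

# Open one PDF device; each high-level plotting call starts a new page.
multi.page.file = file.path(tempdir(), "tumor_two_pages.pdf")
pdf(multi.page.file, width=11, height=8.5)
boxplot(tumors, main="Page 1: boxplot")
barplot(colMeans(tumors), main="Page 2: group means")
# Close the device so the file is completely written.
dev.off()

file.exists(multi.page.file)
```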
###8 Accessing very low p-values
R has an unexpected habit of rounding very low p-values to “< 2.2e-16”, but we can do better than that approximation (if desired). Let’s start by creating some very extreme values.
a = 1:10
a
## [1] 1 2 3 4 5 6 7 8 9 10
b = a + 1000
b
## [1] 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010
Now let’s run Welch’s two sample t-test. Then let’s run the same test but explicitly ask for the p-value.
t.test(a,b)
##
## Welch Two Sample t-test
##
## data: a and b
## t = -738.55, df = 18, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -1002.8447 -997.1553
## sample estimates:
## mean of x mean of y
## 5.5 1005.5
t.test(a,b)$p.value
## [1] 8.605537e-42
Note that we get a much more precise p-value using the second command. We may want to round the value with a modified command such as
signif(t.test(a,b)$p.value, 1)
## [1] 9e-42
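Where does 2.2e-16 come from? It matches R's machine epsilon for double-precision numbers (.Machine$double.eps), which R's p-value formatting uses as its default cutoff when printing. The sketch below compares the stored p-value to that cutoff and shows another way to format it; the variable p is just a name for this example.

```r
a = 1:10
b = a + 1000
p = t.test(a, b)$p.value

# The printed cutoff matches the machine epsilon.
.Machine$double.eps

# The stored p-value is far smaller than that cutoff.
p < .Machine$double.eps

# Format the stored value in scientific notation with one decimal place.
sprintf("%.1e", p)
```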
###9 Ending a session
Let’s save all of our commands. We may want to edit out the not-so-good ones later, but at least we have them all.
#savehistory(file="R_intro_Hot_Topics_commands.R")
#Note: savehistory will not work when knit'd
What variables do we have so far?
ls()
## [1] "a" "b" "ko"
## [4] "pvals.out" "pvals.out.rounded" "tumors"
## [7] "wt" "wt.vs.ko"
And what files have we created?
#Note: pattern is a regular expression, not a shell wildcard
dir(pattern="^tumor")
## [1] "tumor_boxplot.pdf" "tumor_boxplot.png" "tumor_pvals.txt"
In the interests of totally reproducible research, we’ll end by printing our R environment.
date()
## [1] "Thu Nov 14 11:19:21 2024"
print(sessionInfo(), locale=FALSE)
## R version 4.2.1 (2022-06-23)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.6 LTS
##
## Matrix products: default
## BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## loaded via a namespace (and not attached):
## [1] digest_0.6.33 R6_2.5.1 lifecycle_1.0.4 jsonlite_1.8.8
## [5] evaluate_0.23 highr_0.10 cachem_1.0.8 rlang_1.1.2
## [9] cli_3.6.1 rstudioapi_0.15.0 jquerylib_0.1.4 bslib_0.6.1
## [13] rmarkdown_2.25 tools_4.2.1 xfun_0.41 yaml_2.3.7
## [17] fastmap_1.1.1 compiler_4.2.1 htmltools_0.5.7 knitr_1.45
## [21] sass_0.4.8
This is just a start! There’s lots more….
To exit R, type q()
If you save the workspace image, you can start R again in about the same state as you are in now. When you restart R in the same directory, the files called .RData and .Rhistory will reload your data and command history.
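If we only want to keep a few variables rather than the whole workspace, save() and load() can be used on named objects. This is a minimal sketch that writes to a temporary file; saved.file is a name invented for this example.

```r
wt = c(5, 6, 7)
ko = c(8, 9, 11)

# Save only the named objects to a file (a temporary file here).
saved.file = tempfile(fileext=".RData")
save(wt, ko, file=saved.file)

# Remove the objects, then restore them from the file.
rm(wt, ko)
load(saved.file)
wt
ko
```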
##Part II: tidyverse
Tidyverse is a suite of packages, written primarily by Hadley Wickham, for data analysis. We’ll use three packages: readr, tidyr and dplyr, which can make accessing/analyzing data easier than base R.
We’ll first load the packages,
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(readr)
library(tidyr)
Read in the data file,
data<-read_tsv("http://barc.wi.mit.edu/education/hot_topics/Intro_to_R_2016/normalizedCounts_subset.txt")
## Rows: 11 Columns: 31
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: "\t"
## chr (1): Gene
## dbl (30): HMLE_1, HMLE_2, HMLE_3, N8_1, N8_2, N8_3, N8_lo_1, N8_lo_2, N8_lo_...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(data)
## # A tibble: 6 × 31
## Gene HMLE_1 HMLE_2 HMLE_3 N8_1 N8_2 N8_3 N8_lo_1 N8_lo_2 N8_lo_3 N8_lo_4
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 C1orf43 2639. 2637 2140. 1974. 2325. 2721. 12383. 11561. 12654. 12025.
## 2 CHMP2A 396. 375. 308. 386. 390. 401. 3746. 3334. 3474. 3485.
## 3 EMC7 627. 659. 498. 670. 939. 624. 3926. 3887. 4368. 3674.
## 4 GPI 5739. 5868. 7034. 5778. 4972. 5461. 24078. 18815. 21059. 23859.
## 5 PSMB2 2770. 3052. 2986. 2989. 2574. 2113. 10581. 10733. 11364. 10265.
## 6 PSMB4 2844. 3279. 2797. 3216. 3303. 2677. 26545. 23818. 27669. 26145.
## # ℹ 20 more variables: N8_lo_5 <dbl>, N8_lo_6 <dbl>, N8_hi_1 <dbl>,
## # N8_hi_2 <dbl>, N8_hi_3 <dbl>, N8_hi_4 <dbl>, N8_hi_5 <dbl>, N8_hi_6 <dbl>,
## # `2020_159_hi` <dbl>, `2021_159_lo` <dbl>, `2051_159_hi` <dbl>,
## # `2052_159_lo` <dbl>, `2064_159_lo` <dbl>, `2065_159_hi` <dbl>,
## # `2479_231_lo` <dbl>, `2480_231_hi` <dbl>, `2499_231_lo` <dbl>,
## # `2500_231_hi` <dbl>, `2513_231_hi` <dbl>, `2514_231_lo` <dbl>
We’ll now access/query certain columns using dplyr and show its functionality,
#columns 1 to 4
data_HMLE<- select(data, 1:4)
#the same selection, written with the pipe operator %>%
data_HMLE <- data %>% select(1:4)
head(data_HMLE)
## # A tibble: 6 × 4
## Gene HMLE_1 HMLE_2 HMLE_3
## <chr> <dbl> <dbl> <dbl>
## 1 C1orf43 2639. 2637 2140.
## 2 CHMP2A 396. 375. 308.
## 3 EMC7 627. 659. 498.
## 4 GPI 5739. 5868. 7034.
## 5 PSMB2 2770. 3052. 2986.
## 6 PSMB4 2844. 3279. 2797.
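The two select() commands above give the same result; the pipe operator %>% simply passes whatever is on its left as the first argument of the function on its right. A sketch on a small made-up table (counts, nested and piped are names invented for this example):

```r
library(dplyr)

# A small made-up table standing in for the real data.
counts = data.frame(Gene=c("geneA", "geneB"),
                    s1=c(10, 20), s2=c(30, 40), s3=c(50, 60))

# Nested calls: read from the inside out.
nested = head(select(counts, 1:3), 1)

# Piped version: read left to right, each result feeds the next function.
piped = counts %>% select(1:3) %>% head(1)

# Both approaches produce the same table.
identical(nested, piped)
```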
#only column "Gene"
data_rowNames <- select(data,Gene)
head(data_rowNames)
## # A tibble: 6 × 1
## Gene
## <chr>
## 1 C1orf43
## 2 CHMP2A
## 3 EMC7
## 4 GPI
## 5 PSMB2
## 6 PSMB4
#rename column Gene to Genes
data_rowNames <- select(data,Genes=Gene)
head(data_rowNames)
## # A tibble: 6 × 1
## Genes
## <chr>
## 1 C1orf43
## 2 CHMP2A
## 3 EMC7
## 4 GPI
## 5 PSMB2
## 6 PSMB4
#Columns Gene to N8_3 (by column name)
data_range <- select(data, Gene:N8_3)
head(data_range)
## # A tibble: 6 × 7
## Gene HMLE_1 HMLE_2 HMLE_3 N8_1 N8_2 N8_3
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 C1orf43 2639. 2637 2140. 1974. 2325. 2721.
## 2 CHMP2A 396. 375. 308. 386. 390. 401.
## 3 EMC7 627. 659. 498. 670. 939. 624.
## 4 GPI 5739. 5868. 7034. 5778. 4972. 5461.
## 5 PSMB2 2770. 3052. 2986. 2989. 2574. 2113.
## 6 PSMB4 2844. 3279. 2797. 3216. 3303. 2677.
#All columns except Gene
data_noGene <- select(data,-Gene)
head(data_noGene)
## # A tibble: 6 × 30
## HMLE_1 HMLE_2 HMLE_3 N8_1 N8_2 N8_3 N8_lo_1 N8_lo_2 N8_lo_3 N8_lo_4 N8_lo_5
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2639. 2637 2140. 1974. 2325. 2721. 12383. 11561. 12654. 12025. 11570.
## 2 396. 375. 308. 386. 390. 401. 3746. 3334. 3474. 3485. 3196.
## 3 627. 659. 498. 670. 939. 624. 3926. 3887. 4368. 3674. 3862.
## 4 5739. 5868. 7034. 5778. 4972. 5461. 24078. 18815. 21059. 23859. 18633.
## 5 2770. 3052. 2986. 2989. 2574. 2113. 10581. 10733. 11364. 10265. 10857.
## 6 2844. 3279. 2797. 3216. 3303. 2677. 26545. 23818. 27669. 26145. 23917.
## # ℹ 19 more variables: N8_lo_6 <dbl>, N8_hi_1 <dbl>, N8_hi_2 <dbl>,
## # N8_hi_3 <dbl>, N8_hi_4 <dbl>, N8_hi_5 <dbl>, N8_hi_6 <dbl>,
## # `2020_159_hi` <dbl>, `2021_159_lo` <dbl>, `2051_159_hi` <dbl>,
## # `2052_159_lo` <dbl>, `2064_159_lo` <dbl>, `2065_159_hi` <dbl>,
## # `2479_231_lo` <dbl>, `2480_231_hi` <dbl>, `2499_231_lo` <dbl>,
## # `2500_231_hi` <dbl>, `2513_231_hi` <dbl>, `2514_231_lo` <dbl>
#select columns with "lo" and calculate the mean
#?contains
data_lo <- select(data, contains("lo"))
head(data_lo)
## # A tibble: 6 × 12
## N8_lo_1 N8_lo_2 N8_lo_3 N8_lo_4 N8_lo_5 N8_lo_6 `2021_159_lo` `2052_159_lo`
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 12383. 11561. 12654. 12025. 11570. 12584. 4041. 6757.
## 2 3746. 3334. 3474. 3485. 3196. 3505. 1114. 1672.
## 3 3926. 3887. 4368. 3674. 3862. 4361. 1269. 1512.
## 4 24078. 18815. 21059. 23859. 18633. 21224. 7659. 5312.
## 5 10581. 10733. 11364. 10265. 10857. 11163. 3630. 2144.
## 6 26545. 23818. 27669. 26145. 23917. 27725. 3967. 3550
## # ℹ 4 more variables: `2064_159_lo` <dbl>, `2479_231_lo` <dbl>,
## # `2499_231_lo` <dbl>, `2514_231_lo` <dbl>
data_lo <- select(data, matches('lo|Gene'))
head(data_lo)
## # A tibble: 6 × 13
## Gene N8_lo_1 N8_lo_2 N8_lo_3 N8_lo_4 N8_lo_5 N8_lo_6 `2021_159_lo`
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 C1orf43 12383. 11561. 12654. 12025. 11570. 12584. 4041.
## 2 CHMP2A 3746. 3334. 3474. 3485. 3196. 3505. 1114.
## 3 EMC7 3926. 3887. 4368. 3674. 3862. 4361. 1269.
## 4 GPI 24078. 18815. 21059. 23859. 18633. 21224. 7659.
## 5 PSMB2 10581. 10733. 11364. 10265. 10857. 11163. 3630.
## 6 PSMB4 26545. 23818. 27669. 26145. 23917. 27725. 3967.
## # ℹ 5 more variables: `2052_159_lo` <dbl>, `2064_159_lo` <dbl>,
## # `2479_231_lo` <dbl>, `2499_231_lo` <dbl>, `2514_231_lo` <dbl>
data_lo_mean <- select(data_lo,-Gene) %>% mutate(average=rowMeans(.))
data_lo_mean
## # A tibble: 11 × 13
## N8_lo_1 N8_lo_2 N8_lo_3 N8_lo_4 N8_lo_5 N8_lo_6 `2021_159_lo` `2052_159_lo`
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 12383. 11561. 12654. 12025. 11570. 12584. 4041. 6757.
## 2 3746. 3334. 3474. 3485. 3196. 3505. 1114. 1672.
## 3 3926. 3887. 4368. 3674. 3862. 4361. 1269. 1512.
## 4 24078. 18815. 21059. 23859. 18633. 21224. 7659. 5312.
## 5 10581. 10733. 11364. 10265. 10857. 11163. 3630. 2144.
## 6 26545. 23818. 27669. 26145. 23917. 27725. 3967. 3550
## 7 4207. 4423. 4262. 4037. 4545. 4282. 5842. 4290.
## 8 5427. 5190. 5698. 5552. 5079. 5800. 2484. 2056.
## 9 1280. 1442. 1244. 1262. 1430. 1189. 1995. 1151.
## 10 20143. 19895. 21588. 20532. 19714. 21545. 9019. 3533.
## 11 1896. 2487. 2181. 1913. 2501. 2215. 1320. 620.
## # ℹ 5 more variables: `2064_159_lo` <dbl>, `2479_231_lo` <dbl>,
## # `2499_231_lo` <dbl>, `2514_231_lo` <dbl>, average <dbl>
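Note that data_lo_mean no longer has the Gene column, because it was removed before the mean was calculated. One possible way to keep it, sketched here on a small made-up table, is to drop Gene only inside the rowMeans() call. The names lo.example and lo.example.mean are invented for this example.

```r
library(dplyr)

# A small made-up table standing in for data_lo.
lo.example = data.frame(Gene=c("geneA", "geneB"),
                        s1_lo=c(1, 3), s2_lo=c(3, 5))

# Inside the pipe, "." is the table being piped in, so select(., -Gene)
# gives rowMeans() only the numeric columns while Gene stays in the result.
lo.example.mean = lo.example %>% mutate(average=rowMeans(select(., -Gene)))
lo.example.mean
```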
dplyr has the functions group_by() and summarise(), which can be very helpful for grouping and summarising data.
#Gather data to make it 'tidy'
data2<-data %>% gather(Sample, Value, HMLE_1:`2514_231_lo`)
knitr::kable(data)
Gene | HMLE_1 | HMLE_2 | HMLE_3 | N8_1 | N8_2 | N8_3 | N8_lo_1 | N8_lo_2 | N8_lo_3 | N8_lo_4 | N8_lo_5 | N8_lo_6 | N8_hi_1 | N8_hi_2 | N8_hi_3 | N8_hi_4 | N8_hi_5 | N8_hi_6 | 2020_159_hi | 2021_159_lo | 2051_159_hi | 2052_159_lo | 2064_159_lo | 2065_159_hi | 2479_231_lo | 2480_231_hi | 2499_231_lo | 2500_231_hi | 2513_231_hi | 2514_231_lo |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
C1orf43 | 2638.63 | 2637.00 | 2139.90 | 1974.41 | 2324.54 | 2720.88 | 12383.21 | 11561.25 | 12653.96 | 12025.11 | 11569.77 | 12584.14 | 10748.37 | 11580.84 | 11820.45 | 10911.87 | 11559.48 | 12014.27 | 4721.38 | 4040.94 | 6221.58 | 6756.85 | 6799.84 | 6313.16 | 12554.60 | 9495.91 | 10355.72 | 9697.18 | 7085.38 | 14096.89 |
CHMP2A | 396.12 | 374.64 | 308.01 | 385.71 | 390.24 | 401.35 | 3745.64 | 3333.62 | 3474.17 | 3485.49 | 3195.98 | 3505.24 | 3250.63 | 3447.26 | 3273.25 | 3254.42 | 3453.26 | 3220.14 | 1152.03 | 1114.13 | 1539.93 | 1671.94 | 1653.83 | 1525.45 | 3257.41 | 4617.31 | 2688.81 | 4625.35 | 2265.31 | 3231.16 |
EMC7 | 627.20 | 659.25 | 497.97 | 670.10 | 939.40 | 624.18 | 3926.17 | 3886.75 | 4368.18 | 3673.52 | 3861.96 | 4360.74 | 4224.40 | 4401.20 | 4717.53 | 4091.59 | 4324.93 | 4737.03 | 1036.89 | 1268.55 | 1031.23 | 1512.44 | 1301.88 | 1011.79 | 1932.92 | 1911.13 | 1600.03 | 1997.40 | 1689.02 | 1917.48 |
GPI | 5739.40 | 5868.39 | 7034.34 | 5778.32 | 4972.38 | 5461.45 | 24077.59 | 18815.02 | 21058.83 | 23858.70 | 18632.68 | 21223.80 | 20728.80 | 14331.97 | 19301.18 | 21043.71 | 14461.49 | 19351.92 | 7651.48 | 7659.03 | 6225.22 | 5312.36 | 5250.23 | 5892.63 | 20824.48 | 19719.69 | 11163.31 | 17936.13 | 9238.90 | 16378.83 |
PSMB2 | 2770.12 | 3052.46 | 2986.46 | 2989.33 | 2574.36 | 2112.81 | 10581.38 | 10733.41 | 11363.88 | 10264.95 | 10856.66 | 11163.09 | 9971.91 | 10309.92 | 10846.72 | 10176.13 | 10261.09 | 11339.53 | 2992.39 | 3630.22 | 1982.40 | 2143.70 | 2273.16 | 2088.62 | 8792.90 | 9162.77 | 6029.00 | 9473.94 | 6233.50 | 8740.18 |
PSMB4 | 2843.84 | 3278.91 | 2797.41 | 3216.03 | 3302.58 | 2677.03 | 26545.47 | 23817.93 | 27668.51 | 26144.64 | 23917.14 | 27724.72 | 22598.26 | 24148.99 | 26886.67 | 22290.90 | 23938.99 | 26663.68 | 3796.48 | 3967.37 | 3411.72 | 3550.00 | 3617.75 | 3177.27 | 20726.20 | 18808.92 | 12314.58 | 18363.02 | 10926.18 | 19297.21 |
RAB7A | 3471.04 | 3693.25 | 4161.31 | 3209.67 | 3022.29 | 2965.63 | 4206.61 | 4422.55 | 4261.99 | 4037.39 | 4545.31 | 4281.79 | 3386.90 | 4530.12 | 3187.38 | 3590.15 | 4400.86 | 3339.22 | 7583.52 | 5842.31 | 6789.95 | 4289.65 | 3540.02 | 5098.12 | 3665.75 | 4083.54 | 4963.89 | 4437.36 | 4479.49 | 5209.84 |
REEP5 | 2844.94 | 2601.21 | 2491.71 | 2723.11 | 3065.28 | 2949.52 | 5426.53 | 5189.75 | 5698.44 | 5552.06 | 5078.59 | 5800.45 | 4905.75 | 5547.76 | 5747.94 | 5083.06 | 5620.29 | 5901.87 | 1449.00 | 2483.75 | 1165.14 | 2055.53 | 1353.13 | 1071.66 | 3824.88 | 3912.77 | 2905.62 | 3903.41 | 2800.87 | 4275.58 |
SNRPD3 | 1580.10 | 1821.74 | 1493.00 | 1463.32 | 1213.16 | 642.52 | 1279.51 | 1441.60 | 1244.16 | 1262.23 | 1429.94 | 1188.51 | 1006.42 | 1501.44 | 1169.51 | 1062.90 | 1479.10 | 1104.44 | 1537.09 | 1994.60 | 1179.69 | 1150.76 | 800.43 | 914.23 | 654.06 | 684.94 | 782.03 | 731.07 | 1493.16 | 787.39 |
VCP | 8162.37 | 7444.10 | 8229.02 | 7667.78 | 8802.89 | 7981.43 | 20142.66 | 19895.29 | 21588.35 | 20531.65 | 19714.13 | 21545.33 | 22224.94 | 19112.33 | 19903.97 | 22435.19 | 19222.26 | 20376.98 | 6020.01 | 9018.95 | 3701.36 | 3532.59 | 4122.61 | 3993.95 | 25282.36 | 23514.88 | 18529.14 | 23856.52 | 14987.10 | 19691.58 |
VPS29 | 2404.81 | 3198.40 | 1956.39 | 1900.82 | 2673.96 | 1332.03 | 1896.48 | 2487.22 | 2181.22 | 1913.36 | 2501.47 | 2214.82 | 1797.07 | 2789.05 | 2287.50 | 1668.64 | 2709.16 | 2210.61 | 806.61 | 1320.30 | 566.92 | 620.03 | 392.96 | 381.36 | 2130.65 | 1242.98 | 2666.09 | 1359.02 | 1741.01 | 2658.64 |
knitr::kable(data2)
Gene | Sample | Value |
---|---|---|
C1orf43 | HMLE_1 | 2638.63 |
CHMP2A | HMLE_1 | 396.12 |
EMC7 | HMLE_1 | 627.20 |
GPI | HMLE_1 | 5739.40 |
PSMB2 | HMLE_1 | 2770.12 |
PSMB4 | HMLE_1 | 2843.84 |
RAB7A | HMLE_1 | 3471.04 |
REEP5 | HMLE_1 | 2844.94 |
SNRPD3 | HMLE_1 | 1580.10 |
VCP | HMLE_1 | 8162.37 |
VPS29 | HMLE_1 | 2404.81 |
C1orf43 | HMLE_2 | 2637.00 |
CHMP2A | HMLE_2 | 374.64 |
EMC7 | HMLE_2 | 659.25 |
GPI | HMLE_2 | 5868.39 |
PSMB2 | HMLE_2 | 3052.46 |
PSMB4 | HMLE_2 | 3278.91 |
RAB7A | HMLE_2 | 3693.25 |
REEP5 | HMLE_2 | 2601.21 |
SNRPD3 | HMLE_2 | 1821.74 |
VCP | HMLE_2 | 7444.10 |
VPS29 | HMLE_2 | 3198.40 |
C1orf43 | HMLE_3 | 2139.90 |
CHMP2A | HMLE_3 | 308.01 |
EMC7 | HMLE_3 | 497.97 |
GPI | HMLE_3 | 7034.34 |
PSMB2 | HMLE_3 | 2986.46 |
PSMB4 | HMLE_3 | 2797.41 |
RAB7A | HMLE_3 | 4161.31 |
REEP5 | HMLE_3 | 2491.71 |
SNRPD3 | HMLE_3 | 1493.00 |
VCP | HMLE_3 | 8229.02 |
VPS29 | HMLE_3 | 1956.39 |
C1orf43 | N8_1 | 1974.41 |
CHMP2A | N8_1 | 385.71 |
EMC7 | N8_1 | 670.10 |
GPI | N8_1 | 5778.32 |
PSMB2 | N8_1 | 2989.33 |
PSMB4 | N8_1 | 3216.03 |
RAB7A | N8_1 | 3209.67 |
REEP5 | N8_1 | 2723.11 |
SNRPD3 | N8_1 | 1463.32 |
VCP | N8_1 | 7667.78 |
VPS29 | N8_1 | 1900.82 |
C1orf43 | N8_2 | 2324.54 |
CHMP2A | N8_2 | 390.24 |
EMC7 | N8_2 | 939.40 |
GPI | N8_2 | 4972.38 |
PSMB2 | N8_2 | 2574.36 |
PSMB4 | N8_2 | 3302.58 |
RAB7A | N8_2 | 3022.29 |
REEP5 | N8_2 | 3065.28 |
SNRPD3 | N8_2 | 1213.16 |
VCP | N8_2 | 8802.89 |
VPS29 | N8_2 | 2673.96 |
C1orf43 | N8_3 | 2720.88 |
CHMP2A | N8_3 | 401.35 |
EMC7 | N8_3 | 624.18 |
GPI | N8_3 | 5461.45 |
PSMB2 | N8_3 | 2112.81 |
PSMB4 | N8_3 | 2677.03 |
RAB7A | N8_3 | 2965.63 |
REEP5 | N8_3 | 2949.52 |
SNRPD3 | N8_3 | 642.52 |
VCP | N8_3 | 7981.43 |
VPS29 | N8_3 | 1332.03 |
C1orf43 | N8_lo_1 | 12383.21 |
CHMP2A | N8_lo_1 | 3745.64 |
EMC7 | N8_lo_1 | 3926.17 |
GPI | N8_lo_1 | 24077.59 |
PSMB2 | N8_lo_1 | 10581.38 |
PSMB4 | N8_lo_1 | 26545.47 |
RAB7A | N8_lo_1 | 4206.61 |
REEP5 | N8_lo_1 | 5426.53 |
SNRPD3 | N8_lo_1 | 1279.51 |
VCP | N8_lo_1 | 20142.66 |
VPS29 | N8_lo_1 | 1896.48 |
C1orf43 | N8_lo_2 | 11561.25 |
CHMP2A | N8_lo_2 | 3333.62 |
EMC7 | N8_lo_2 | 3886.75 |
GPI | N8_lo_2 | 18815.02 |
PSMB2 | N8_lo_2 | 10733.41 |
PSMB4 | N8_lo_2 | 23817.93 |
RAB7A | N8_lo_2 | 4422.55 |
REEP5 | N8_lo_2 | 5189.75 |
SNRPD3 | N8_lo_2 | 1441.60 |
VCP | N8_lo_2 | 19895.29 |
VPS29 | N8_lo_2 | 2487.22 |
C1orf43 | N8_lo_3 | 12653.96 |
CHMP2A | N8_lo_3 | 3474.17 |
EMC7 | N8_lo_3 | 4368.18 |
GPI | N8_lo_3 | 21058.83 |
PSMB2 | N8_lo_3 | 11363.88 |
PSMB4 | N8_lo_3 | 27668.51 |
RAB7A | N8_lo_3 | 4261.99 |
REEP5 | N8_lo_3 | 5698.44 |
SNRPD3 | N8_lo_3 | 1244.16 |
VCP | N8_lo_3 | 21588.35 |
VPS29 | N8_lo_3 | 2181.22 |
C1orf43 | N8_lo_4 | 12025.11 |
CHMP2A | N8_lo_4 | 3485.49 |
EMC7 | N8_lo_4 | 3673.52 |
GPI | N8_lo_4 | 23858.70 |
PSMB2 | N8_lo_4 | 10264.95 |
PSMB4 | N8_lo_4 | 26144.64 |
RAB7A | N8_lo_4 | 4037.39 |
REEP5 | N8_lo_4 | 5552.06 |
SNRPD3 | N8_lo_4 | 1262.23 |
VCP | N8_lo_4 | 20531.65 |
VPS29 | N8_lo_4 | 1913.36 |
C1orf43 | N8_lo_5 | 11569.77 |
CHMP2A | N8_lo_5 | 3195.98 |
EMC7 | N8_lo_5 | 3861.96 |
GPI | N8_lo_5 | 18632.68 |
PSMB2 | N8_lo_5 | 10856.66 |
PSMB4 | N8_lo_5 | 23917.14 |
RAB7A | N8_lo_5 | 4545.31 |
REEP5 | N8_lo_5 | 5078.59 |
SNRPD3 | N8_lo_5 | 1429.94 |
VCP | N8_lo_5 | 19714.13 |
VPS29 | N8_lo_5 | 2501.47 |
C1orf43 | N8_lo_6 | 12584.14 |
CHMP2A | N8_lo_6 | 3505.24 |
EMC7 | N8_lo_6 | 4360.74 |
GPI | N8_lo_6 | 21223.80 |
PSMB2 | N8_lo_6 | 11163.09 |
PSMB4 | N8_lo_6 | 27724.72 |
RAB7A | N8_lo_6 | 4281.79 |
REEP5 | N8_lo_6 | 5800.45 |
SNRPD3 | N8_lo_6 | 1188.51 |
VCP | N8_lo_6 | 21545.33 |
VPS29 | N8_lo_6 | 2214.82 |
C1orf43 | N8_hi_1 | 10748.37 |
CHMP2A | N8_hi_1 | 3250.63 |
EMC7 | N8_hi_1 | 4224.40 |
GPI | N8_hi_1 | 20728.80 |
PSMB2 | N8_hi_1 | 9971.91 |
PSMB4 | N8_hi_1 | 22598.26 |
RAB7A | N8_hi_1 | 3386.90 |
REEP5 | N8_hi_1 | 4905.75 |
SNRPD3 | N8_hi_1 | 1006.42 |
VCP | N8_hi_1 | 22224.94 |
VPS29 | N8_hi_1 | 1797.07 |
C1orf43 | N8_hi_2 | 11580.84 |
CHMP2A | N8_hi_2 | 3447.26 |
EMC7 | N8_hi_2 | 4401.20 |
GPI | N8_hi_2 | 14331.97 |
PSMB2 | N8_hi_2 | 10309.92 |
PSMB4 | N8_hi_2 | 24148.99 |
RAB7A | N8_hi_2 | 4530.12 |
REEP5 | N8_hi_2 | 5547.76 |
SNRPD3 | N8_hi_2 | 1501.44 |
VCP | N8_hi_2 | 19112.33 |
VPS29 | N8_hi_2 | 2789.05 |
C1orf43 | N8_hi_3 | 11820.45 |
CHMP2A | N8_hi_3 | 3273.25 |
EMC7 | N8_hi_3 | 4717.53 |
GPI | N8_hi_3 | 19301.18 |
PSMB2 | N8_hi_3 | 10846.72 |
PSMB4 | N8_hi_3 | 26886.67 |
RAB7A | N8_hi_3 | 3187.38 |
REEP5 | N8_hi_3 | 5747.94 |
SNRPD3 | N8_hi_3 | 1169.51 |
VCP | N8_hi_3 | 19903.97 |
VPS29 | N8_hi_3 | 2287.50 |
C1orf43 | N8_hi_4 | 10911.87 |
CHMP2A | N8_hi_4 | 3254.42 |
EMC7 | N8_hi_4 | 4091.59 |
GPI | N8_hi_4 | 21043.71 |
PSMB2 | N8_hi_4 | 10176.13 |
PSMB4 | N8_hi_4 | 22290.90 |
RAB7A | N8_hi_4 | 3590.15 |
REEP5 | N8_hi_4 | 5083.06 |
SNRPD3 | N8_hi_4 | 1062.90 |
VCP | N8_hi_4 | 22435.19 |
VPS29 | N8_hi_4 | 1668.64 |
C1orf43 | N8_hi_5 | 11559.48 |
CHMP2A | N8_hi_5 | 3453.26 |
EMC7 | N8_hi_5 | 4324.93 |
GPI | N8_hi_5 | 14461.49 |
PSMB2 | N8_hi_5 | 10261.09 |
PSMB4 | N8_hi_5 | 23938.99 |
RAB7A | N8_hi_5 | 4400.86 |
REEP5 | N8_hi_5 | 5620.29 |
SNRPD3 | N8_hi_5 | 1479.10 |
VCP | N8_hi_5 | 19222.26 |
VPS29 | N8_hi_5 | 2709.16 |
C1orf43 | N8_hi_6 | 12014.27 |
CHMP2A | N8_hi_6 | 3220.14 |
EMC7 | N8_hi_6 | 4737.03 |
GPI | N8_hi_6 | 19351.92 |
PSMB2 | N8_hi_6 | 11339.53 |
PSMB4 | N8_hi_6 | 26663.68 |
RAB7A | N8_hi_6 | 3339.22 |
REEP5 | N8_hi_6 | 5901.87 |
SNRPD3 | N8_hi_6 | 1104.44 |
VCP | N8_hi_6 | 20376.98 |
VPS29 | N8_hi_6 | 2210.61 |
C1orf43 | 2020_159_hi | 4721.38 |
CHMP2A | 2020_159_hi | 1152.03 |
EMC7 | 2020_159_hi | 1036.89 |
GPI | 2020_159_hi | 7651.48 |
PSMB2 | 2020_159_hi | 2992.39 |
PSMB4 | 2020_159_hi | 3796.48 |
RAB7A | 2020_159_hi | 7583.52 |
REEP5 | 2020_159_hi | 1449.00 |
SNRPD3 | 2020_159_hi | 1537.09 |
VCP | 2020_159_hi | 6020.01 |
VPS29 | 2020_159_hi | 806.61 |
C1orf43 | 2021_159_lo | 4040.94 |
CHMP2A | 2021_159_lo | 1114.13 |
EMC7 | 2021_159_lo | 1268.55 |
GPI | 2021_159_lo | 7659.03 |
PSMB2 | 2021_159_lo | 3630.22 |
PSMB4 | 2021_159_lo | 3967.37 |
RAB7A | 2021_159_lo | 5842.31 |
REEP5 | 2021_159_lo | 2483.75 |
SNRPD3 | 2021_159_lo | 1994.60 |
VCP | 2021_159_lo | 9018.95 |
VPS29 | 2021_159_lo | 1320.30 |
C1orf43 | 2051_159_hi | 6221.58 |
CHMP2A | 2051_159_hi | 1539.93 |
EMC7 | 2051_159_hi | 1031.23 |
GPI | 2051_159_hi | 6225.22 |
PSMB2 | 2051_159_hi | 1982.40 |
PSMB4 | 2051_159_hi | 3411.72 |
RAB7A | 2051_159_hi | 6789.95 |
REEP5 | 2051_159_hi | 1165.14 |
SNRPD3 | 2051_159_hi | 1179.69 |
VCP | 2051_159_hi | 3701.36 |
VPS29 | 2051_159_hi | 566.92 |
C1orf43 | 2052_159_lo | 6756.85 |
CHMP2A | 2052_159_lo | 1671.94 |
EMC7 | 2052_159_lo | 1512.44 |
GPI | 2052_159_lo | 5312.36 |
PSMB2 | 2052_159_lo | 2143.70 |
PSMB4 | 2052_159_lo | 3550.00 |
RAB7A | 2052_159_lo | 4289.65 |
REEP5 | 2052_159_lo | 2055.53 |
SNRPD3 | 2052_159_lo | 1150.76 |
VCP | 2052_159_lo | 3532.59 |
VPS29 | 2052_159_lo | 620.03 |
C1orf43 | 2064_159_lo | 6799.84 |
CHMP2A | 2064_159_lo | 1653.83 |
EMC7 | 2064_159_lo | 1301.88 |
GPI | 2064_159_lo | 5250.23 |
PSMB2 | 2064_159_lo | 2273.16 |
PSMB4 | 2064_159_lo | 3617.75 |
RAB7A | 2064_159_lo | 3540.02 |
REEP5 | 2064_159_lo | 1353.13 |
SNRPD3 | 2064_159_lo | 800.43 |
VCP | 2064_159_lo | 4122.61 |
VPS29 | 2064_159_lo | 392.96 |
C1orf43 | 2065_159_hi | 6313.16 |
CHMP2A | 2065_159_hi | 1525.45 |
EMC7 | 2065_159_hi | 1011.79 |
GPI | 2065_159_hi | 5892.63 |
PSMB2 | 2065_159_hi | 2088.62 |
PSMB4 | 2065_159_hi | 3177.27 |
RAB7A | 2065_159_hi | 5098.12 |
REEP5 | 2065_159_hi | 1071.66 |
SNRPD3 | 2065_159_hi | 914.23 |
VCP | 2065_159_hi | 3993.95 |
VPS29 | 2065_159_hi | 381.36 |
C1orf43 | 2479_231_lo | 12554.60 |
CHMP2A | 2479_231_lo | 3257.41 |
EMC7 | 2479_231_lo | 1932.92 |
GPI | 2479_231_lo | 20824.48 |
PSMB2 | 2479_231_lo | 8792.90 |
PSMB4 | 2479_231_lo | 20726.20 |
RAB7A | 2479_231_lo | 3665.75 |
REEP5 | 2479_231_lo | 3824.88 |
SNRPD3 | 2479_231_lo | 654.06 |
VCP | 2479_231_lo | 25282.36 |
VPS29 | 2479_231_lo | 2130.65 |
C1orf43 | 2480_231_hi | 9495.91 |
CHMP2A | 2480_231_hi | 4617.31 |
EMC7 | 2480_231_hi | 1911.13 |
GPI | 2480_231_hi | 19719.69 |
PSMB2 | 2480_231_hi | 9162.77 |
PSMB4 | 2480_231_hi | 18808.92 |
RAB7A | 2480_231_hi | 4083.54 |
REEP5 | 2480_231_hi | 3912.77 |
SNRPD3 | 2480_231_hi | 684.94 |
VCP | 2480_231_hi | 23514.88 |
VPS29 | 2480_231_hi | 1242.98 |
C1orf43 | 2499_231_lo | 10355.72 |
CHMP2A | 2499_231_lo | 2688.81 |
EMC7 | 2499_231_lo | 1600.03 |
GPI | 2499_231_lo | 11163.31 |
PSMB2 | 2499_231_lo | 6029.00 |
PSMB4 | 2499_231_lo | 12314.58 |
RAB7A | 2499_231_lo | 4963.89 |
REEP5 | 2499_231_lo | 2905.62 |
SNRPD3 | 2499_231_lo | 782.03 |
VCP | 2499_231_lo | 18529.14 |
VPS29 | 2499_231_lo | 2666.09 |
C1orf43 | 2500_231_hi | 9697.18 |
CHMP2A | 2500_231_hi | 4625.35 |
EMC7 | 2500_231_hi | 1997.40 |
GPI | 2500_231_hi | 17936.13 |
PSMB2 | 2500_231_hi | 9473.94 |
PSMB4 | 2500_231_hi | 18363.02 |
RAB7A | 2500_231_hi | 4437.36 |
REEP5 | 2500_231_hi | 3903.41 |
SNRPD3 | 2500_231_hi | 731.07 |
VCP | 2500_231_hi | 23856.52 |
VPS29 | 2500_231_hi | 1359.02 |
C1orf43 | 2513_231_hi | 7085.38 |
CHMP2A | 2513_231_hi | 2265.31 |
EMC7 | 2513_231_hi | 1689.02 |
GPI | 2513_231_hi | 9238.90 |
PSMB2 | 2513_231_hi | 6233.50 |
PSMB4 | 2513_231_hi | 10926.18 |
RAB7A | 2513_231_hi | 4479.49 |
REEP5 | 2513_231_hi | 2800.87 |
SNRPD3 | 2513_231_hi | 1493.16 |
VCP | 2513_231_hi | 14987.10 |
VPS29 | 2513_231_hi | 1741.01 |
C1orf43 | 2514_231_lo | 14096.89 |
CHMP2A | 2514_231_lo | 3231.16 |
EMC7 | 2514_231_lo | 1917.48 |
GPI | 2514_231_lo | 16378.83 |
PSMB2 | 2514_231_lo | 8740.18 |
PSMB4 | 2514_231_lo | 19297.21 |
RAB7A | 2514_231_lo | 5209.84 |
REEP5 | 2514_231_lo | 4275.58 |
SNRPD3 | 2514_231_lo | 787.39 |
VCP | 2514_231_lo | 19691.58 |
VPS29 | 2514_231_lo | 2658.64 |
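The gather() command used to create data2 still works, but newer versions of tidyr also provide pivot_longer() for the same job. The sketch below compares the two on a small made-up table; wide, long.gather and long.pivot are names invented for this example. The two results contain the same values, but the rows come out in a different order.

```r
library(tidyr)

# A small made-up wide table: one row per gene, one column per sample.
wide = data.frame(Gene=c("geneA", "geneB"), s1=c(10, 20), s2=c(30, 40))

# Older interface, as used for data2.
long.gather = gather(wide, Sample, Value, s1:s2)

# Newer interface; cols=-Gene means "every column except Gene".
long.pivot = pivot_longer(wide, cols=-Gene,
                          names_to="Sample", values_to="Value")

# Same four gene/sample/value combinations; only the row order differs.
long.gather
long.pivot
```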
#select values greater than 10000
data2_genesHigh <- filter(data2, Value > 10000)
data2_genesHigh %>% select(Gene) %>% distinct()
## # A tibble: 5 × 1
## Gene
## <chr>
## 1 C1orf43
## 2 GPI
## 3 PSMB2
## 4 PSMB4
## 5 VCP
#Spread (opposite of gather)
data2_genesHighSpread <- spread(data2_genesHigh,Sample,Value)
data2_genesHighSpread
## # A tibble: 5 × 19
## Gene `2479_231_lo` `2480_231_hi` `2499_231_lo` `2500_231_hi` `2513_231_hi`
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 C1orf43 12555. NA 10356. NA NA
## 2 GPI 20824. 19720. 11163. 17936. NA
## 3 PSMB2 NA NA NA NA NA
## 4 PSMB4 20726. 18809. 12315. 18363. 10926.
## 5 VCP 25282. 23515. 18529. 23857. 14987.
## # ℹ 13 more variables: `2514_231_lo` <dbl>, N8_hi_1 <dbl>, N8_hi_2 <dbl>,
## # N8_hi_3 <dbl>, N8_hi_4 <dbl>, N8_hi_5 <dbl>, N8_hi_6 <dbl>, N8_lo_1 <dbl>,
## # N8_lo_2 <dbl>, N8_lo_3 <dbl>, N8_lo_4 <dbl>, N8_lo_5 <dbl>, N8_lo_6 <dbl>
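The NA entries in the table above mark gene/sample combinations that were removed by the filter (values of 10000 or less), so spread() has nothing to put in those cells. A sketch of the same effect on a small made-up table (long, high and high.wide are names invented for this example, with a cutoff of 10 instead of 10000):

```r
library(dplyr)
library(tidyr)

# A small made-up long table.
long = data.frame(Gene=c("geneA", "geneA", "geneB", "geneB"),
                  Sample=c("s1", "s2", "s1", "s2"),
                  Value=c(5, 50, 60, 70))

# Keep only values above 10; the geneA/s1 row is dropped.
high = filter(long, Value > 10)

# Spreading back to wide format leaves NA where a row was filtered out.
high.wide = spread(high, Sample, Value)
high.wide
```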
#group_by and summarise
data2_genesMean <- data2 %>% group_by(Gene) %>% summarise(average=mean(Value))
data2_genesMean
## # A tibble: 11 × 2
## Gene average
## <chr> <dbl>
## 1 C1orf43 8466.
## 2 CHMP2A 2408.
## 3 EMC7 2427.
## 4 GPI 13500.
## 5 PSMB2 6930.
## 6 PSMB4 14747.
## 7 RAB7A 4290.
## 8 REEP5 3781.
## 9 SNRPD3 1203.
## 10 VCP 15041.
## 11 VPS29 1867.
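summarise() can calculate several statistics per group in one command. This is a sketch on a small made-up table; long, gene.summary and the column names std.dev and n.samples are choices made for this example.

```r
library(dplyr)

# A small made-up long table with two values per gene.
long = data.frame(Gene=c("geneA", "geneA", "geneB", "geneB"),
                  Value=c(2, 4, 10, 30))

# One output row per gene, with several summary columns at once.
gene.summary = long %>%
  group_by(Gene) %>%
  summarise(average=mean(Value), std.dev=sd(Value), n.samples=n())
gene.summary
```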
ggplot2 is a powerful, popular graphics package that can be easily used with other tidyverse packages.
library(ggplot2)
data2 %>% ggplot(aes(x=factor(Gene),y=Value))+geom_boxplot()+theme(axis.text.x = element_text(angle = 90, hjust = 1))
data2 %>% filter(Sample %in% c("HMLE_1","HMLE_2")) %>% ggplot(aes(x=factor(Gene),y=Value))+geom_point()+facet_grid(. ~ Sample)+theme(axis.text.x = element_text(angle = 90, hjust = 1)) + geom_line()
## `geom_line()`: Each group consists of only one observation.
## ℹ Do you need to adjust the group aesthetic?
## `geom_line()`: Each group consists of only one observation.
## ℹ Do you need to adjust the group aesthetic?
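The geom_line() message above appears because the x axis is a factor, so each gene is treated as its own group with a single point, and no line is drawn. One common remedy is to set group=1 so that all points belong to one series. The sketch below shows this on a small made-up table; one.sample and p are names invented for this example.

```r
library(ggplot2)

# A small made-up table with one value per gene.
one.sample = data.frame(Gene=c("geneA", "geneB", "geneC"), Value=c(5, 9, 7))

# group=1 tells geom_line() to treat all rows as a single series,
# so it connects the points across the discrete x axis.
p = ggplot(one.sample, aes(x=factor(Gene), y=Value, group=1)) +
  geom_point() +
  geom_line()
p
```

Whether connecting genes with a line makes sense depends on the data; for unordered categories like gene names, points alone may be the clearer choice.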