A principle of writing robust R program

2014-02-15 Kun Ren

r

Writing R code can be very easy. It depends on how much you want to achieve with your code and what features you want your code to support. Continue reading »

rlist: a new package for working with list objects in R

2014-06-26 Kun Ren

r list rlist package

In recent years, non-relational data have attracted more and more attention. Roughly speaking, all datasets that are hard to put into a rectangular table with rows and columns are non-relational datasets. Continue reading »

Use pipeline operators in R

2014-04-08 Kun Ren

r pipeline pipeR

In data-driven statistical computing and data analysis, applying a chain of commands step by step is a common situation. However, writing a group of deeply nested function calls is neither straightforward nor flexible, because the function that is applied last must be written first. Continue reading »

Use SQL to operate R data frames

2014-02-07 Kun Ren

r data sql

In both research and application, we need to manipulate data frames by selecting desired columns, filtering records, transforming and aggregating data. Continue reading »

Extract information from texts with regular expressions in R

2014-02-20 Kun Ren

r regular expression

People love dealing with well-structured data. It takes much less effort than working with disorganized raw texts. In economic and financial research, we typically download data from open-access websites or authentication-required databases. These sources may provide data in multiple formats. For example, almost all Continue reading »

Reshape R data frame from long to wide format

2014-03-14 Kun Ren

r reshape data

Oftentimes, we obtain a long or a wide table from a certain data source, and it may be the only format we can get. For example, some financial databases provide daily tick data for all stocks in a financial market. The data table may be arranged in a long format like this: Continue reading »

Garbage collection in R (the GC algorithm)

2017-09-30 YongHao Hu

DNS

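R's garbage collector uses a generational algorithm. One small optimization is that it uses the NAMED field to implement copy-on-write: when NAMED is 0, nothing is using the object and it can be deleted; when NAMED is 1, an expression is currently using it, so a copy is made; when NAMED is 2, another variable points to it, so it must be copied before it is modified. Continue reading »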

Installing bioinformatics software [2]

2014-11-05 summer

centos biosoft

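The previous post, "Installing bioinformatics software [1]", covered installing R. When using R, however, we often need additional packages such as ggplot2, which have to be installed on top of the base R installation. Once R is installed, open the R console and run the following commands: Continue reading »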

The four major theorems of elementary number theory

2018-04-11 Vaniot

number theory, mathematics

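Fermat's little theorem: if $a$ is an integer, $p$ is a prime, and $a$ and $p$ are coprime (that is, their only common divisor is 1), then $a^{p-1}$ always leaves remainder 1 when divided by $p$, i.e. $a^{p-1}\equiv 1 \pmod{p}$. Proof (method 1): since $\gcd(a,p)=1$, take the $p-1$ numbers $1\cdot a, 2\cdot a, \ldots, (p-1)\cdot a$ and divide each of them by $p$, giving remainders $r_{1}, r_{2}, \ldots, r_{p-1}$. The set $\{r_{1}, r_{2}, \ldots, r_{p-1}\}$ is a rearrangement of $\{1, 2, 3, \ldots, p-1\}$, that is, each of $1, 2, 3, \ldots, p-1$ appears exactly once among the remainders (for Continue reading »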
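The rearrangement step in the proof above can be checked numerically. The following is a minimal Python sketch, not part of the original post: the helper name `residues` and the values $a=3$, $p=7$ are chosen here only for illustration. It lists the remainders of $1\cdot a, \ldots, (p-1)\cdot a$ modulo $p$, confirms that they are a permutation of $1, \ldots, p-1$, and evaluates $a^{p-1} \bmod p$.

```python
def residues(a, p):
    """Remainders of 1*a, 2*a, ..., (p-1)*a on division by p (hypothetical helper)."""
    return [(k * a) % p for k in range(1, p)]


# Illustrative values (an assumption of this sketch): p is prime and gcd(a, p) = 1.
a, p = 3, 7

r = residues(a, p)
print(r)                               # [3, 6, 2, 5, 1, 4]
print(sorted(r) == list(range(1, p)))  # True: each of 1..p-1 appears exactly once
print(pow(a, p - 1, p))                # 1, i.e. a^(p-1) leaves remainder 1 mod p
```

The three-argument form of the built-in `pow` performs modular exponentiation, so the last line computes the remainder without forming the full power.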

An introduction to singular value decomposition (SVD)

2015-07-26 王财勇

machine learning, mathematical algorithms

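This article was written after reading a wide range of material, including online resources and textbooks. Its basic outline is: 1. a more precise mathematical description of singular value decomposition, including its definition and properties; 2. a brief introduction to MATLAB's singular value decomposition function. SVD in mathematics: we first state the definition of the SVD. [Definition] Let $A\in R^{m\times n}$. Then there exist orthogonal matrices $U\in R^{m\times m}$ and $V\in R^{n\times n}$ such that $A=U\Sigma V^{T}$, where $\Sigma = \operatorname{diag}(\Sigma_1,O) \in R^{m\times n}$ and $\Sigma_1=\operatorname{diag}(\sigma_1,\sigma_2,\ldots,\sigma_r)$ Continue reading »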