In recent years, non-relational data has attracted more and more attention. Roughly speaking, any dataset that is hard to fit into a rectangular table of rows and columns is non-relational. Continue reading »
In data-driven statistical computing and data analysis, it is common to apply a chain of commands step by step. However, writing that chain as a group of deeply nested function calls is neither straightforward nor flexible, because the function that is applied last must be written first. Continue reading »
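To make the contrast concrete, here is a minimal sketch in R. It assumes R 4.1 or later for the native `|>` pipe, which is only one of several ways to chain calls and is not necessarily the tool the full post discusses. The vector `x` is a made-up example.

```r
# Hypothetical input vector, for illustration only
x <- c(1, 4, 9, 16)

# Nested form: read inside out; round() is applied last but written first
nested <- round(sum(sqrt(x)), 1)

# Chained form with the native pipe (assumes R >= 4.1): read left to right,
# in the same order the steps are applied
piped <- x |> sqrt() |> sum() |> round(1)

# Both forms compute the same value
stopifnot(identical(nested, piped))
```

In the chained form, a step can be added or removed without rebalancing parentheses.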
In both research and application, we need to manipulate data frames by selecting the desired columns, filtering records, and transforming and aggregating data. Continue reading »
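The four operations named above can be sketched with base R alone. The data frame `df` and its columns are hypothetical, and the full post may well use a dedicated package instead.

```r
# Hypothetical data frame, for illustration only
df <- data.frame(
  id    = 1:4,
  group = c("a", "a", "b", "b"),
  price = c(10, 20, 30, 40),
  qty   = c(1, 2, 3, 4)
)

# Filter records (price > 10) and select the desired columns
kept <- subset(df, price > 10, select = c(group, price, qty))

# Transform: add a derived column
kept <- transform(kept, value = price * qty)

# Aggregate: total value per group
totals <- aggregate(value ~ group, data = kept, FUN = sum)
```

Each step takes a data frame and returns a new one, so the steps can be reordered or chained.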
People love dealing with well-structured data. It takes much less effort than working with disorganized raw text.
In economic and financial research, we typically download data from open-access websites or authentication-required databases. These sources may provide data in multiple formats. For example, almost all Continue reading »
Oftentimes, we obtain a long or a wide table from a data source, and it may be the only format available. For example, some financial databases provide daily tick data for all stocks in a financial market. The data table may be arranged in a long format like this: Continue reading »
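R's garbage collector uses a generational algorithm. As a small optimization, R uses the name field (NAMED in the R internals) to implement copy on write.
When name is 0, nothing is using the object, so it can be deleted;
When name is 1, an expression is currently using it, so a copy has been made;
When name is 2, another variable points to it, so a copy must be made when it is modified. Continue reading »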
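The visible effect of copy on write can be checked from R code, without inspecting the internal field itself. The following is a minimal sketch with a made-up vector. It shows only the user-level behaviour, not how the counter is updated internally.

```r
x <- c(1, 2, 3)   # hypothetical example vector
y <- x            # y refers to the same vector; nothing is copied yet

y[1] <- 100       # modifying y triggers the copy, so x is left untouched

stopifnot(identical(x, c(1, 2, 3)))
stopifnot(identical(y, c(100, 2, 3)))
```

Because the copy is deferred until the first modification, assigning a large object to a second name is cheap as long as neither name is written to.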