10-K/Q Section Text Change Detection
Tuesday, October 30, 2018

The Code

Goal

Reduce the time analysts spend reading 10-K/Qs by highlighting the sections that change the most between periods.

Hypothesis

The cosine distance between the Term Frequency-Inverse Document Frequency (TF-IDF) vectors of a 10-K section in consecutive filings is a useful proxy for semantic change in that section over time.
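For reference, the quantities named in the hypothesis can be written out as follows. This is one common formulation, not necessarily the exact one used in the linked code: the post does not say which TF-IDF weighting variant it uses, and many libraries apply a smoothed IDF instead. The symbols N (number of documents), tf, and df are introduced here for illustration.

```latex
% TF-IDF weight of term t in document d, over a corpus of N documents
\mathrm{tfidf}(t, d) = \mathrm{tf}(t, d) \cdot \mathrm{idf}(t),
\qquad
\mathrm{idf}(t) = \log \frac{N}{\mathrm{df}(t)}

% Cosine distance between the TF-IDF vectors u and v of two documents
d_{\cos}(u, v) = 1 - \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert}
```

Because TF-IDF weights are non-negative, the distance falls between 0 and 1: 0 means the two vectors point in the same direction (for example, identical text), and values near 1 mean the two sections share almost no weighted vocabulary.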

Procedure

  1. Use the Calcbench Python API Client to download the Risk Factors section of each 10-K.
  2. Tokenize the sections.
  3. Build TF-IDF matrices.
  4. Compute the cosine distance between each section and the same section from the previous filing/period.
  5. Render the matrix of distances with the largest distances highlighted.
  6. Review large changes by “diffing” documents whose distance is above a certain threshold.
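Steps 2 through 4 can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not the code the post links to: the tokenizer is a simple regular expression, the IDF is a smoothed variant chosen so that terms appearing in every filing keep a non-zero weight, and all function names are hypothetical. The download in step 1 is left out because it depends on the Calcbench API client; the sketch takes section text as plain strings.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse {term: weight} vector per document.

    Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1.
    """
    token_lists = [tokenize(doc) for doc in docs]
    n_docs = len(token_lists)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_distance(a, b):
    """1 - cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        # Two empty sections count as unchanged; one empty section as fully changed.
        return 0.0 if norm_a == norm_b else 1.0
    return 1.0 - dot / (norm_a * norm_b)


def period_over_period_distances(sections_by_year):
    """Map each year (except the earliest) to its distance from the prior year.

    sections_by_year: {year: section text} for a single company.
    """
    years = sorted(sections_by_year)
    vectors = dict(zip(years, tfidf_vectors([sections_by_year[y] for y in years])))
    return {
        later: cosine_distance(vectors[earlier], vectors[later])
        for earlier, later in zip(years, years[1:])
    }
```

Calling `period_over_period_distances` with a dictionary such as `{2016: text_2016, 2017: text_2017, 2018: text_2018}` yields one distance per year after the first. Here the IDF is computed over one company's own filings; computing it over a larger corpus of filings is an equally plausible design choice and would change the weights.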

Highlight Risk Factors with Greatest Change

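Each cell is the cosine distance between that period's Risk Factors section and the same section from the previous period. In the rendered version, the brightest cells mark the documents that changed the most vis-a-vis the previous period.
Ticker 2018 2017 2016 2015 2014 2013 2012 2011 2010 2009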
JNJ 0 0.01 0.607 0.02 0.57 0 0.042 0 0.026 0
JPM 0 0.55 0.008 0.008 0.013 0.023 0.013 0.028 0.427 0
WBA 0.007 0.014 0.017 0.187 0 0.063 0.244 0.099 0 0
DWDP 0 0.24 0.135 0.045 0.033 0.02 0.033 0.01 0.068 0
V 0 0.034 0.153 0.066 0.007 0.027 0.021 0.016 0.097 0
MCD 0 0.014 0.019 0.015 0.178 0.045 0.038 0.04 0.051 0
VZ 0 0.012 0.061 0.053 0.049 0.097 0.035 0.029 0.063 0
WMT 0.061 0.056 0.025 0.084 0.03 0.076 0.02 0.024 0 0
PFE 0 0.015 0.074 0.063 0.034 0.02 0.039 0.047 0.044 0
PG 0.026 0.02 0.02 0.069 0.022 0.018 0.102 0.026 0 0
UTX 0 0.022 0.006 0.031 0.009 0.04 0.033 0.072 0.069 0
HD 0 0.037 0.018 0.03 0.12 0.017 0.012 0.02 0.025 0
CVX 0 0.013 0.028 0.045 0.02 0.001 0.005 0.075 0.057 0
INTC 0 0 0.011 0.002 0.068 0.018 0.032 0.094 0.017 0
AXP 0 0.009 0.015 0.032 0.016 0.031 0.025 0.022 0.059 0
KO 0 0.013 0.007 0.014 0.012 0.011 0.037 0.035 0.065 0
UNH 0 0.008 0.006 0.006 0.009 0.023 0.014 0.052 0.038 0
MSFT 0.024 0.013 0.012 0.012 0.031 0.012 0.035 0.017 0 0
BA 0 0.005 0.005 0.002 0.004 0.009 0.032 0.036 0.046 0
MMM 0 0.008 0.008 0.026 0.007 0.009 0.005 0.044 0.026 0
DIS 0 0.003 0.003 0.006 0.026 0.012 0.025 0.035 0.014 0
MRK 0 0.014 0.006 0.019 0.012 0.009 0.012 0.022 0.023 0
GS 0 0.013 0.01 0.015 0.012 0.009 0.006 0.024 0.02 0
CAT 0 0.002 0.009 0.004 0.009 0.011 0.018 0.013 0.04 0
NKE 0.015 0.009 0.005 0.01 0.01 0.023 0.017 0.007 0 0
XOM 0 0.031 0.008 0.018 0.003 0.007 0.012 0.015 0 0
AAPL 0 0.004 0.003 0.001 0.004 0.003 0.027 0.017 0.004 0
TRV 0 0.005 0.006 0.002 0.005 0.012 0.007 0.012 0.016 0
CSCO 0.004 0.002 0.004 0.002 0.006 0.012 0.003 0.01 0 0
IBM 0 0.004 0.004 0.003 0.001 0.011 0.001 0.001 0.005 0
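
Steps 5 and 6 of the procedure can be approximated in plain Python: pick out the cells above a threshold, then diff the underlying text. The sketch below uses three rows copied from the table above, truncated to 2014-2018. The 0.2 threshold, the function names, and the use of `difflib` are illustrative assumptions; the post does not state a threshold or a diffing tool.

```python
import difflib

# Three rows copied from the table above (2014-2018 only): {ticker: {year: distance}}.
distances = {
    "JNJ": {2018: 0.0, 2017: 0.01, 2016: 0.607, 2015: 0.02, 2014: 0.57},
    "JPM": {2018: 0.0, 2017: 0.55, 2016: 0.008, 2015: 0.008, 2014: 0.013},
    "IBM": {2018: 0.0, 2017: 0.004, 2016: 0.004, 2015: 0.003, 2014: 0.001},
}


def flag_large_changes(distances, threshold):
    """Return (ticker, year, distance) tuples above the threshold, largest first."""
    flagged = [
        (ticker, year, distance)
        for ticker, by_year in distances.items()
        for year, distance in by_year.items()
        if distance > threshold
    ]
    return sorted(flagged, key=lambda item: item[2], reverse=True)


def diff_sections(previous_text, current_text):
    """Line-level unified diff between two versions of a section."""
    return list(difflib.unified_diff(
        previous_text.splitlines(),
        current_text.splitlines(),
        fromfile="previous",
        tofile="current",
        lineterm="",
    ))


# Hypothetical threshold; in practice it would be tuned by inspecting the matrix.
for ticker, year, distance in flag_large_changes(distances, threshold=0.2):
    print(ticker, year, distance)
```

With these rows and a 0.2 threshold, the flagged cells are JNJ 2016, JNJ 2014, and JPM 2017. Those filings would then be passed, together with the prior period's text, to `diff_sections` for review.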
