In my pursuit of understanding asset markets, I’ve maintained a SQL Server database with a lot of information about the stock market and other indexes. Using some data services along with SSIS, this database has been kept current to the point that it now has over 3 billion records in total including 2.3 billion records in an intraday table. Most of my queries and interest concerns cumulative equity and index end-of-day history which is only 75 million row and the options data since 2003 which is now up to 175 million rows.
To be able to query this level of data, I utilize Fusion-io PCIE SSD storage for the SQL Server database. Using self-joins can produce some very interesting analysis. For example, the below query outlines the performance of a few global indexes where there have been large bounces close to market tops and the ensuing performance afterwards. This query complete in just a few seconds. There are couple of tricks that make this run faster – one is the storing of a relative day number to avoid performance issues with working around weekends for querying prior dated history. The day number is sequential across holidays and weekends so a direct link can be done without a range test. The other trick is that the table is partitioned based on the date which allows a good deal of parallelism.
Here is the query
select h.TradingSymbol, h.MarketDate, h.ChgPct + hprev.ChgPct as TwoDayChg, hprev.PriceAtClose as Price,
(h20.PriceAtClose – h.PriceAtClose) / h.PriceAtClose as Chg1Month,
(h60.PriceAtClose – h.PriceAtClose) / h.PriceAtClose as Chg3Month,
(h180.PriceAtClose – h.PriceAtClose) / h.PriceAtClose as Chg9Month,
(h360.PriceAtClose – h.PriceAtClose) / h.PriceAtClose as Chg18Month,
(h540.PriceAtClose – h.PriceAtClose) / h.PriceAtClose as Chng27Month,
((select MAX(hpast.PriceAtClose) from dbo.EquityHistory hpast
where hpast.TradingSymbol = h.TradingSymbol
and hpast.MarketDate < hprev.MarketDate) – hprev.PriceAtClose) / hprev.PriceAtClose
as PctFromTop
from dbo.EquityHistory h
inner join dbo.EquityHistory hprev
on hprev.DayNumber = h.DayNumber – 1
and hprev.TradingSymbol = h.TradingSymbol
inner join dbo.EquityHistory h20
on h20.TradingSymbol = hprev.TradingSymbol
and h20.DayNumber = hprev.DayNumber + 20
inner join dbo.EquityHistory h60
on h60.TradingSymbol = hprev.TradingSymbol
and h60.DayNumber = hprev.DayNumber + 60
inner join dbo.EquityHistory h180
on h180.TradingSymbol = hprev.TradingSymbol
and h180.DayNumber = hprev.DayNumber + 180
inner join dbo.EquityHistory h360
on h360.TradingSymbol = hprev.TradingSymbol
and h360.DayNumber = hprev.DayNumber + 360
inner join dbo.EquityHistory h540
on h540.TradingSymbol = hprev.TradingSymbol
and h540.DayNumber = hprev.DayNumber + 540
where h.TradingSymbol in (‘^dji’,’^dax’,’^ixic’,’^n225′,’^ftse’,’^djt’,’^gspc’,’^rut’,’^ssec’)
and h.ChgPct + hprev.ChgPct > 6.0
and (select MAX(hpast.PriceAtClose) from dbo.EquityHistory hpast
where hpast.TradingSymbol = h.TradingSymbol
and hpast.MarketDate < hprev.MarketDate) between 1.0 * h.PriceAtClose and 1.15 * h.PriceAtClose
order by h.MarketDate
And here are the results formatted in Excel:
This is not a good omen for the stock market for the next couple of years based on history.
There is probably not enough data here to draw conclusions, but since the query is against indexes rather than individual stocks, it does seem pretty convincing. It is pretty certain that a query against the entire equity history for this would yield similar bottom line averages, but that would take several minutes to complete.
Here is the query again as it shows with the SQL markup: