As a rule, each column in Excel (labeled with the letters) will represent a variable of interest. Usually, my column A is date, column B is time (if I’m exploring intraday data), column C is open price, column D is high price, column E is low price, and column F is closing price. Column G might be devoted to volume data for each of those periods (if that’s part of what I’m investigating); columns H and above will be devoted to other variables of interest, such as the data series for another index or stock or the readings of a market indicator for that period. Each row of data is a time period, such as a day. Generally, my data are organized so that the earliest data are in row 2 and the later data fall underneath. I save row 1 for data labels, so that each column is labeled clearly: DATE, OPEN, HIGH, LOW, CLOSE, etc. You’ll see why this labeling is helpful when we get to the process of sorting the data.
Here are some simple statistical functions that I use frequently to examine data in a qualitative way. Each example assumes that we’re investigating the data in column C, from cells 1 through 10:• =median(C2:C11) - The median value for the data in the formula.
• =max(C2:C11) - The largest value for the data in the formula.
• =min(C2:C11) - The smallest value for the data in the formula.
• =stdev(C2:C11) - The standard deviation for the data in the formula.
• =correl(C2:C11,D2:D11) - The correlation between the data in columns C and D, cells 1-10.
Much of the time, our analyses won’t be of the raw data, but will be of the changes in the data from period to period. The formula =(C3-C2) gives the difference from cell C2 to cell C3. If we want to express this difference as a percentage (so that we’re analyzing percent price changes from period to period), the formula would read = ((C3-C2)/C2)*100. This takes the difference of cells C3 and C2 as a proportion of the initial value (C2), multiplied by 100 to give a percentage.
When we want to update later cells with the percentage difference information, we don’t need to rewrite the formulas. Instead, as noted above, we click on the cell with the formula, click on the Excel menu item Edit, click on copy, then left click the cell below the one with the formula and drag down as far as we want the data. The spreadsheet will calculate price changes for each of the time periods that you selected by dragging. This means that if you save your formulas into worksheets, updating your data is as simple as downloading the fresh data from your vendor, pasting into the appropriate cells in your sheets, and copying the data from formulas for the cells representing the new data period. Once you’ve organized your sheets in this manner, it thus only takes a few minutes a day to completely update.
Once you create a spreadsheet with the appropriate formulas, updating your analyses is mostly a matter of pasting and copying. As a result, you can update many analyses in just a few minutes.
Once again, the basic formulas, arrangement of rows and columns, and copying of data will take some practice before you move on to actual analyses. I strongly encourage you to become proficient with downloading your data from your vendor/application and manipulating the data in Excel with copying, pasting, and formula writing before moving on. Once you have these skills, you’ll have them for life, and they will greatly aid your ability to generate promising trading hypotheses.
COACHING CUE
Trading platforms that support Dynamic Data Exchange (DDE) enable you to link spreadsheets to the platform’s data servers, so that the spreadsheets will populate in real time. This is helpful for tracking indicators as you trade, and it can also be a time-efficient way to archive data of interest. See Rennie Yang’s segment in Chapter 9 for an illustration of the use of DDE.
LESSON 94: VISUALIZE YOUR DATA
One of the best ways to explore data for possible relationships is to actually see the data for yourself. You can create simple charts in Excel that will enable you to see how two variables are related over time, identifying possible patterns that you might not have noticed from the spreadsheet rows and columns. For instance, when charting an indicator against a market average, you may notice divergence patterns that precede changes in market direction. Should you notice such patterns frequently, they might form the basis for worthwhile historical explorations.
Again, a basic introductory text for Excel users will cover the details of creating different kinds of charts, from column charts to line graphs to pie charts. You’ll also learn about the nuances of changing the colors on a chart, altering the graphics, and labeling the various lines and axes. In this lesson, I’ll walk you through a few basics that will get you started in your data exploration.
Many times, you can identify potential trading hypotheses by seeing relationships among data elements.
A simple chart to begin with will have dates in column A, price data in column B, and a second set of price data in column C (see Table 10.1). This chart is helpful when you want to visualize how movements in the first trading instrument are related to movements in the second. For basic practice, here are some hypothetical data to type into Excel, with the data labels in the first row. Column A has the dates, column B contains closing prices for a market index, and column C has the closing prices for a mining stock.
To create the chart, highlight the data with your mouse, including the data labels, and click on the Excel menu item for Insert. You’ll select Chart and a menu of different kinds of charts will appear. You’ll click on Line and select the chart option at the top left in the submenu. That is a simple line chart. Then click Next, and you will see a small picture of your chart, the range of your data, and whether the series are in rows or columns. Your selection should be columns, because that is how you have your variables separated. Click Next again and you will see Step 3 of 4 in the Chart Wizard, allowing you to type in a chart title and labels for the X and Y axes. Go ahead and type in Market Index and Mining Stock for the title, Date for the Category (X) axis label, and Price for the Value (Y) axis label. Then click Next.
TABLE 10.1 Sample Market Data
The Step 4 of 4 screen will ask you if you want the chart as an object in your spreadsheet, or if you want the chart to be on a separate sheet. Go ahead and select the option for “As new sheet.” Then click Finish.
What you’ll see is that the Wizard has recognized the date information from column A and placed it on the X-axis. The Wizard has also given us a single Y-axis and scaled it according to the high and low values in the data. Unfortunately, this leaves us unable to see much of the ups and downs in the mining stock data, since the price of the stock is much smaller than the price of the index.
To correct this problem, point your cursor at the line on the chart for your Market Index and right click. A menu will pop up, and you will select the option for Format Data Series. Click the tab for Axes and then click on the button for “Plot Series on Secondary Axis.” When you do that, the picture of the chart underneath the buttons will change, and you’ll notice now that you have two Y-axes: one for the Market Index price data and one for the Mining Stock data. You’ll be able to see their relative ups and downs much more clearly. Click on OK and you will see your new chart. If you’d like the Y-axes to have new labels, you can place your cursor on the center of the chart (away from the lines for the data) and right click. A menu will pop up, and you’ll select Chart Options. That will give you a screen enabling you to type in new labels for the Value (Y) axis (at left) and the Second value (Y) axis (at right).
If you right click on either of the two lines in the chart and, from the pop-up menu select Format Data Series, you’ll see a tab for Patterns. You can click the arrow beside the option for Weight and make the line thicker. You can click the arrow beside the option for Color and change the color of the line.
If you right-click on the X- or Y-axes, you’ll get a pop-up menu; click on Format Axis. If you select the tab for Font, you can choose the typeface, font style, and size of the print for the axis labels. If you select the tab for Scale, you can change the range of values for the axis. With a little practice, you c
an customize the look of your charts.
So what does your chart tell you? You can see that the Mining Stock is not moving in unison with the Market Index. When the index shows large rises or declines, the stock is tending to move in the opposite direction. By itself, over such a short period, that won’t tell you anything you’d want to hang your hat on, but it does raise interesting questions:• Why is the mining stock moving opposite to the market index? Might the mining stock be moving in unison with the gold market instead?
• If the mining stock is moving with gold, is gold also moving opposite to the market index? If so, why might that be? Might there be a common influence on both of them: the strength of the U.S. dollar?
• Does this relationship occur over intraday time frames? Might we be able to identify some buy or sell signals in the mining stock when we see selling or buying in the broad market?
Reviewing charts that you create helps you see intermarket and intramarket relationships.
Many times, investigating relationships through charts leads you to worthwhile questions, which may then lead you to interesting and profitable trading ideas. The key is asking “Why?” What might be responsible for the relationship I am observing? Remember, in your own self-coaching, you want to be generating hypotheses, and there is no better way than plain old brainstorming. When you can actually see how the data are related to each other in graphical form, it is easier to accomplish that brainstorming. You won’t arrive at hard and fast conclusions, but you’ll be on your way toward generating promising trading ideas.
COACHING CUE
Plot charts of the S&P 500 Index (SPY) against the major sector ETFs from the S&P 500 universe as a great way to observe leading and lagging sectors, as well as divergences at market highs and lows. The sectors I follow most closely are: XLB (Materials); XLI (Industrials); XLY (Consumer Discretionary); XLP (Consumer Staples); XLE (Energy); XLF (Financial); XLV (Health Care); and XLK (Technology). If you want to bypass such charting, you can view excellent sector-related indicators and charts at the Decision Point site (www.decisionpoint.com). Another excellent site for stock and sector charts is Barchart (www.barchart.com).
LESSON 95: CREATE YOUR INDEPENDENT AND DEPENDENT VARIABLES
When I organize my spreadsheets, I generally place my raw data furthest to the left (columns A, B, C, etc.); transformations of the raw data into independent variables in the middle; and dependent variables furthest to the right. Let’s take a look at what this means.
Your independent variables are what we might call candidate predictors. They are variables that we think have an effect on the markets we’re trading. For example, let’s say that we’re investigating the impact of price change over the previous day of trading (independent variable) on the next day’s return for the S&P 500 Index (dependent variable). The raw data would consist of price data for the S&P 500 Index over the look-back period that we select. The independent variable would be a moving calculation of the prior day’s return. The dependent variable would be a calculation of the return over the next day. The independent variable is what we think might give us a trading edge; the dependent variable is what we would be trading to exploit that edge.
If I keep my raw data to the left in the spreadsheet, followed by transformations of the raw data to form the independent variable, and then followed by the dependent variable, I keep analyses clear from spreadsheet to spreadsheet.
Let’s set that up as an exercise. We’ll download data for the S&P 500 Index (cash close) for the past 1,000 trading days. That information will give us roughly four years of daily data. If I obtain the data from Pinnacle Data, I’ll open a blank sheet in Excel; click on the Excel menu item for File; click Open; go to the Data folder in the C drive; double-click on the IDXDATA folder; select All Files as the Files of type; and double-click the S&P 500 file. I’ll highlight the cells for the past 1,000 sessions; click on the Edit menu item in Excel; click on Copy; open a new, blank spreadsheet; then click the Edit menu item again; and click Paste, with the cursor highlighting the A2 cell. The data from the Pinnacle sheet will appear in my worksheet, leaving row A for data labels (Date, Open, High, Low, Close).
If you download your data from another source, your menu items to access the data will differ, but the result will be the same: you’ll copy the data from your source and paste them into the blank spreadsheet at cell A2, then create your data labels. As a result, your raw data will occupy columns A-E. (Column A will be Date; column B will be Open; column C will be High; column D will be Low; and column E will be Close.) Now, for the data label for column F (cell F1), you can type (without quotation marks): “SP(1).” This is your independent variable, the current day’s rate of change in the index. Your first entry will go into cell F3 and will be (again without quotations marks): “=((E3-E2)/E2)*100).” This is the percentage change in the index from the close of the session at A2 to the close of the session at A3.
Now let’s create our dependent variable in cell G7, with column G labeled SP+1 at G1. Your formula for cell G3 will be “=((E4-E3)/E3)*100).” This represents the next day’s percentage return for the index.
To complete your sheet, you would click and highlight the formula cells at F3 and G3; click on the Excel menu item for Edit; and select the option for Copy. You’ll see the F3 and G3 cells specially highlighted. Then, with your cursor highlighting cells F4 and G4, drag your mouse down the full length of the data set and release, highlighting all those cells. Click again on the Excel menu item for Edit, then select Paste. Your spreadsheet will calculate the formulas for each of the cells and the data portion of your spreadsheet will be finished. The raw data will be in columns A-E. The independent variable (our candidate predictor) will be in column F; and our variable of trading interest—the dependent variable—will reside in column G. Save this spreadsheet as Practice Sheet in an Excel folder. We’ll be using it for future lessons.
Note that we downloaded 1,000 days worth of data, but the actual number of data points in our sample is 998. We could not compute SP(1) from the first data point because we didn’t have the prior day’s close; hence we had to begin our formula in the third data row. We also could not compute SP+1 from the last data point because we don’t know tomorrow’s closing price. Thus our analyses can only use 998 of the data points of the 1,000 that we downloaded. If you want an even 1,000 data points, you’d have to download the last 1,002 values.
With a bit of practice, all of this will become second nature. It will take only a minute or two to open your data files, copy and paste the raw data, write your formulas, and copy the cells to complete your sheet. In this example, we are exploring how the prior day’s return is related to the next day’s return. We’re setting the spreadsheet up to ask the question, “Does it make sense to buy after an up day/sell after a down day; does it make sense to sell after an up day/buy after a down day; or does it make no apparent difference?” I call the independent variable the candidate predictor, because we don’t really know if it is related to our variable of interest. It’s also only a candidate because we’re not conducting the statistical significance tests that would tell us more conclusively that this is a significant predictor. Rather, we’re using the analysis much as we used the charting in the prior lesson: as a way to generate hypotheses.
Remember, in the current examples, we’re using historical relationship to describe patterns in markets, not to statistically analyze them. We’re generating, not testing, hypotheses.
If I had been interested in examining the relationship between the prior week’s price change with the next week’s return, the spreadsheet would look very similar, except the raw data would consist of weekly index data, rather than daily. In general, it’s neatest for analysis if you are investigating the impact of the prior period’s data on the next period. This ensures that all observations are independent; there are no overlapping data.
To see what I mean, consider investigating the relationship of the prior week’s (five-day) price change on the
price change over the next five trading days utilizing daily market data. Your independent variable in column F would now look like “=((E7-E2)/E2)*100)”—price change over the past five days. The dependent variable in column G would be written as “=((E12-E7)/E7)*100”: the next five-day’s price change. Note, however, that as you copy those cells down the spreadsheet per the above procedure, that each observation at cells F8, F9, F10, and so on and G8, G9, G10, and so on, is not completely independent. The prior five-day return overlaps the values for F8, F9, and F10, and the prospective five-day return overlaps for cells G8, G9, and G10. This will always be the case when you’re using a smaller time period for your raw data than the period that you’re investigating for your independent and dependent variables.
Inferential statistical tests depend on each observation in the data set being independent, so it is not appropriate to include overlapping data when calculating statistical significance. For my purpose of hypothesis generation, I am willing to tolerate a degree of overlap, and so will use daily data to investigate relationships of up to 20 days in duration—particularly if the amount of overlap relative to the size of the entire data set is small. I would not, say, investigate the next 200 days’ return using daily data for a sample of 1,000 trading days. I wouldn’t have a particular problem using the daily data to investigate, for instance, the prior five-day price change on the next five-day return with a four-year look-back period.
The Daily Trading Coach Page 43