I’ve moved to WordPress: http://bobtisdale.wordpress.com/

Thursday, April 8, 2010

Converting txt Data Into Columns In EXCEL

I’ve moved to WordPress.  This post can now be found at Converting txt Data Into Columns In EXCEL
####################
You’ve made the decision to investigate climate data, or to verify a claim made by a blogger, or to check the graphs climate bloggers present in their posts. You want to make sure they’re not making incorrect claims or being creative with the data. So, for example, you go to the KNMI Climate Explorer Monthly observations webpage and select HADISST data. On the next page, you input -5 and 5 for latitudes and -170 and -120 for longitudes, which are the coordinates of the NINO3.4 region (5S-5N, 170W-120W) of the tropical Pacific, an ENSO index. Then, on the next page, you scroll down to the third graph “anomalies” and click on “Raw Data” above it. There, you’re presented with the webpage shown in Figure 1, with two columns of data in txt format: months and the monthly NINO3.4 SST anomalies from January 1870 to present.

http://i41.tinypic.com/x59stf.jpg
Figure 1

You copy and paste the data in TXT format into EXCEL, and you run into a stumbling block, Figure 2. Both of the columns of numbers appear in one column in EXCEL. You don’t know what to do, so, frustrated, you close the windows and forget about it.
http://i43.tinypic.com/2m4u2s5.jpg
Figure 2

Or you discover how simple converting the txt Data to Columns is in EXCEL, Figures 3 through 5. Your version of EXCEL may be different, but I believe the “Convert Text To Column Wizard” is present in other versions.
NOTE: The data often includes numbers in Scientific Notation (Example: 0.377E-02). Do not delete any part of the number. EXCEL understands Scientific Notation and will automatically convert it (Convert the example to 0.00377).
http://i42.tinypic.com/51x3px.jpg
Figure 3
##################
http://i41.tinypic.com/2dl966b.jpg
Figure 4
##################
http://i44.tinypic.com/vhqku9.jpg
Figure 5

Yup, it’s that simple.

28 comments:

Dominic said...

Bob

Thanks! I was -so- stuck with this (frankly daft) little problem but very unwilling to trouble you again for more help.

Underway once more...

Dominic

Dominic said...
This comment has been removed by a blog administrator.
Bob Tisdale said...

Dominic: The E-01 in -0.377926E-01 is not a third column. -0.377926E-01 is scientific notation for the number -0.0377926. The "E notation" (E-01) means 10 to the minus 1 power. Make sure the E-01 or E-02, etc., is included in the second column.

Dominic said...

Bob

Ah. Now that explains a lot.
This is proving somewhat more complex than I imagined. All the more reason to plough on.
So, when I partition the single column into two and the number changes from [-0.377926E-01] to [-3.78E-02] I'm still looking at -0.0378. The light dawns.
This is, to be blunt, rather embarrassing, but at least I am finding out what I need to know. Once again, thanks for the guidance. Sorry if it was a head-in-hands moment for you there.

Dominic

Dominic said...

Got it.
Southern Ocean OHC reproduced exactly as in http://i50.tinypic.com/1236qlt.png
Thank you for all your help.

Dominic

Bob Tisdale said...

Dominic: Sorry about deleting your earlier comment but I foresaw someone reading your suggestion to delete the E notation and not going on to read my reply.

Enjoy. I've learned a lot from SST and OHC data.

Regards

Dominic said...

No problem. I shouldn't have posted in haste, as it were. Plus you spared me a few (more) blushes.

Dominic

David Shipley said...

Thanks Bob. I was too embarrassed to ask for help, and now I find it is so simple I am even more embarrassed!

Andrew said...

Thanks for this Bob! I'm sick and tired of going from word to excel. I completely forgot to check if Excel could do this on it's own!

Dominic said...

Bob
Just one final question.
How do you get Excel to use correct date values for the x-axis instead of cell row numbers or its default date series starting in 1900?
I had no problem working out how to generate trend lines and apply a 13-month moving average filter to the curves, but the x-axis issue is proving intractable. Nothing in the help system or that I can find online throws any light on the issue.
I suspect that unless I am being obtuse this will catch out other Excel novices who have got this far.

Many thanks, as ever,

Dominic

Unknown said...

Bob,

Use this all the time but maybe you can help in another way.

A lot of data has the temp info in rows. I've had to cut and paste it into columns. Is there anyway that Excels graphing function can handle this automatically without cut and pasting or using a macro to get the temps in a column.

Also, I'm using Excel to modify a file v2.temperature.inv. However, saving it seems to change something in the file so it will no longer run from a R script.

You seem to manipulate a lot of data so I'm plumbing you for ideas.

Thanks

Unknown said...

Bob,

Use this all the time but maybe you can help in another way.

A lot of data has the temp info in rows. I've had to cut and paste it into columns. Is there anyway that Excels graphing function can handle this automatically without cut and pasting or using a macro to get the temps in a column.

Also, I'm using Excel to modify a file v2.temperature.inv. However, saving it seems to change something in the file so it will no longer run from a R script.

You seem to manipulate a lot of data so I'm plumbing you for ideas.

Thanks

Bob Tisdale said...

Dominic: You asked, "How do you get Excel to use correct date values for the x-axis instead of cell row numbers or its default date series starting in 1900?"

If you're getting your data from KNMI, why not use their numerical months? Why would you want to change them? So that they read Jan-01, Feb-01, etc., in your column?

Or are you asking another question?

Bob Tisdale said...

Steve: In your April 9, 2010 2:06 PM comment, you asked, "Is there anyway that Excels graphing function can handle this automatically without cut and pasting or using a macro to get the temps in a column."

I haven't found one, but I wish there was one.

And I have no idea about the second problem you discussed, sorry. I don't use R. Also I really don't manipulate data, other than smoothing it.

Bob Tisdale said...

Steve: Didn't you post three comments today? I accepted all three, though two were the same, but I'm only seeing two comments now. Sorry, I don't know what happened to the third.

Unknown said...

Thanks Bob,

Sorry about the double post, not use to the comment process.

By the way, what version of Excel are you using? Excel 2007, makes this process really easy. For example, if you go to UAH website:
http://vortex.nsstc.uah.edu/data/msu/t2lt/uahncdc.lt
and copy all the data, you can simply paste that into notepad, go to Excel, open the saved notepad data in Excel and the dialogue box opens automagically.

Some people seem to hate Excel 2007.

My quest for a solution to the row problem continues!!

Bob Tisdale said...

Steve: I'm also using EXCEL 2007. Thanks for the notepad process, but doesn't that create additional steps and an additional saved file?

Unknown said...

Bob

Actually, i didn't count the steps so you are probably right. Just for fun, next time I'll try it your way!! I like to keep the back up original file sometimes.

It just seems odd to me that Microsoft hasn't made it easy to produce graphs using rows. Maybe it's hard to code. It seems like R has no problems with it. Maybe something to do with arrays which I haven't looked into.

Dominic said...

Bob

I must be doing something wrong:
1. Copy date/temperature data from KNMI eg [1955.0000 -0.272808] etc.
2. Paste into Excel 2007. Result is (eg) 661 rows for NODC OHC data.
3. Use Data/Text-to-column to split using fixed width option. Result, two columns with date [1955] etc in A and temp [-0.272808] etc in B.
4. Pick Insert/Charts/Line. Result: the right curve, but ALWAYS with cell row numbers along the x-axis.
I have made no changes to any data from KNMI.
I have tried highlighting column A, then column B, or just having a single cell from either selected. Nothing makes any difference. Always cell row numbers on x-axis.
I'm sure it's simple and obvious, but unfortunately, not to me...

Cheers

Dominic

Bob Tisdale said...

Dominic: Sometimes EXCEL can be obnoxious. Place a "Name" at the top of each column: Month and the Name of the Dataset. Then highlight all cells, including the "name" (better word is probably "header"). Then try "Insert", "Scatter", and then the "Scatter with straight lines". That's what I use.

Dominic said...

Bob

I did as you suggested and things got worse.
The result of adding the headers 'Month' (Column A) and eg 'S Atlantic' (Col B) was startling: The x-axis was scaled 0 0.5 1 1.5 etc.
The y-axis was scaled 0 0.2 0.4 etc.
There was no curve whatsoever, just an empty plot area. We are both using Excel 2007. I am completely baffled...

Dominic said...

Bob

I think I have found the problem.
As I was working through the whole process afresh (to avoid some old error recurring unnoticed) I forgot to delete the comment on the final line of the data set [# repeat last year for a nice gnuplot].
Once removed - success!
It might be worth warning others about this?

Sorry about the fuss. Your ongoing help has been deeply appreciated.

Dominic

magellan said...

To copy rows to a single column, apply this macro for Excel.

Sub RowsToSingleColumn()
Dim Reply As Range
Dim cell As Range
Dim RowCount As Integer
On Error Resume Next
Set Reply = Application.InputBox _
("Select your data", "rowstosinglecolumn", _
Selection.CurrentRegion.Address, Type:=8)
If Reply Is Nothing Then Exit Sub

Application.ScreenUpdating = False
With Reply
.Columns(1).EntireColumn.Insert
.Columns(1).Offset(0, -1).Cells(1, 1) = "Transposed"
RowCount = .Rows.Count

For Each cell In .Rows
cell.Rows(1).Copy
.Columns(1).Offset(0, -1) _
.Range("A65536").End(xlUp).Offset(1, 0) _
.PasteSpecial Transpose:=True

Next
End With
Application.ScreenUpdating = True
End Sub

If you need help executing the macro or creating an .xla, let me know.

Unknown said...

Thanks d

It works!!

Bob

You are probably aware of this but have you seen this site:

http://sio-argo.ucsd.edu/Marine_Atlas.html

They will send you a free CD with all the latest argo data and the program to run it. (contact Megan)

Bob Tisdale said...

Steve: Thanks for the ARGO link. I have been looking at it, considering the pros and cons of being able to investigate the ARGO data. In other words, I'm procrastinating.

Charlie A said...

Bob T. @April 9, 2010 5:09 PM: "If you're getting your data from KNMI, why not use their numerical months? Why would you want to change them? So that they read Jan-01, Feb-01, etc., in your column?"

In playing around with some NSIDC data I found that I needed to be able to look at an individual month's record over many years.

The original data format was text in 2 columns. One column was date like the KNMI, the 2nd column of text was the monthly sea ice extent.

I found it easiest to format the data into an array. One row for each year, 12 columns for months. It turned out that an easy way to go from 2 columns of fraction years + data was to use the INT function to generate columns for just the integer year, and then another column for "month number".

Then I used a pivot table where Excel (or in my case, the data pilot of Open Office SCALC) did all the work of converting the column to the format I wanted.

The annual average is now just the average of the 12 columns of a row. It is also easy to do monthly averages and plots since they are now in their own column.

Pivot tables sounded like they might be difficult to work with, but in fact were quite easy.

Bob Tisdale said...

Charlie A: Agreed. Sometimes we need to change the data format from column to table or vice versa, but the reply you quoted was in response to someone who wanted to change how months appeared on his spreadsheet. At least that's how I read his question. He wanted to change the months (1979.0000, 1979.0834, 1979.1666, etc.), presented by KNMI into a form he was familiar with (Jan-79, Feb-79, Mar-79), and there's no reason to do that if all he was going to do was create a graph. In fact, if you change the months from their numerical form, the x-axis is more difficult to work with.

MikeN said...

Anyone just using Works?
To do the same thing there, do a fin for the first space, then substring out that portion, then again for the left block.
For example, if each entry is Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec temp data,
then
substr(a,find(a,' '), len(a)) will assign to the columns
jan-dec, feb-dec, mar-dec,..dec
Then do
substr(a,0,find(a,' '))
to get jan - dec in their own column.

Not sure if I have the exact function calls,

Donations

Tips are now being accepted.

Comment Policy, SST Posts, and Notes

Comments that are political in nature or that have nothing to do with the post will be deleted.
####
The Smith and Reynolds SST Posts DOES NOT LIST ALL SST POSTS. I stopped using ERSST.v2 data for SST when NOAA deleted it from NOMADS early in 2009.

Please use the search feature in the upper left-hand corner of the page for posts on specific subjects.
####
NOTE: I’ve discovered that some of the links to older posts provide blank pages. While it’s possible to access that post by scrolling through the history, that’s time consuming. There’s a quick fix for the problem, so if you run into an absent post, please advise me. Thanks.
####
If you use the graphs, please cite or link to the address of the blog post or this website.