House Price Crash Forum

Large Data Plotting Exercise


anonguest


0
HOLA441

HELP!

To all the 'techies'/scientists out there.......

I have a one off 'data plotting' task to do that is beyond my woefully out of date software/maths/science skills.....This is not my job it is purely voluntary and charitable, being done for others.

Without going into boring/confidential details.....I have a very large 2D x-y coordinate data file that is in spreadsheet form (i.e. tens of thousands of rows of data, each with x-number of columns of variable values). All the data is numerical.

Attached is a whimsical illustrative example of the data format.

I need to plot and 'extract' certain information from this raw data in certain ways.

I need to produce....

Colour coded 'temperature' style contour map of some of the variables, with varying degrees of sophistication (see the attached crude schematic to illustrate what I mean)

Say, for example, firstly a simple(?) 'map' showing the varying density of people over the entire 2D space.

More advanced(?) data extraction might be to be able to determine/calculate, again purely for illustrative example, the total number of people contained within a specified region (as defined by ranges of x-y coordinates)

There might be need to further filter the data, to compute/calculate the total number of people within a certain region that are over a certain height only. And so on.

Can MS-Excel do this sort of chart plotting??? or most of it??

I haven't used Excel since Office2000, and have just now upgraded to Office2010 specifically to be able to handle the >100,000 rows of data. It looks horrible!

Also, someone suggested to me a specialist scientific chart plotting software called Sigmaplot, and which can offer much more sophisticated chart plotting capabilities. I have just downloaded a free trial version.

BUT, before I proceed any further, can anyone advise/reassure if I am going in the right direction? Using the right tools for the job? etc. OR will neither of these software tools do the job? Which is the easiest way to go about doing this job?

Thanks to all.

contour map.jpg

data format.jpg



1
HOLA442
2
HOLA443

Is THIS what you have in mind ?

Depends upon your ability and willingness to programme - you asked for technical, so....

However, MATLAB requires a license, and it's not cheap for a 'one-off'

R is free and very similar to Matlab. R-Project website - it is used by many scientists just as they also use Matlab

Lots of online tutorials - contour plots are quite straightforward.

Do you need statistical analysis as well ?


3
HOLA444
4
HOLA445
5
HOLA446

Thanks all, for replies so far. I will start to 'digest' the suggestions.

I don't really want to do (or am not sure if I can!) any really serious 'maths'/programming.

I was just hoping there is some plotting tool (now looks like Excel is wayyyy too simple for the task) that can be used, and that has at least some of the typical 'functions' required built in as standard.

I don't need to do statistical 'analysis' beyond the information that will be revealed by the type of plots required, but, as mentioned, I will want to 'extract' some numerical data values for various 'regions' (e.g. total number of people, either all or of various types, located within some defined area).

Above all, the software used should be free (at least for a short period).

Hope that clarifies/helps
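
For the 'region totals' part of the task, a few lines of Python may already be enough. The following is only a sketch under assumptions that are not in the thread: each row is taken to be one person, and the column names x, y and height_cm are made up for illustration. In practice the rows would be read from a CSV file exported from the spreadsheet rather than from the inline sample.

```python
import csv
import io

# Hypothetical sample data: one row per person, with made-up column names.
SAMPLE_CSV = """x,y,height_cm
1.0,2.0,170
1.5,2.5,182
4.0,4.0,165
2.0,1.0,190
9.0,9.0,175
"""


def load_rows(text):
    """Parse CSV text into a list of dicts whose values are floats."""
    reader = csv.DictReader(io.StringIO(text))
    return [{key: float(value) for key, value in row.items()} for row in reader]


def count_in_region(rows, x_min, x_max, y_min, y_max, min_height=None):
    """Count rows inside a rectangle, optionally only those taller than min_height."""
    count = 0
    for row in rows:
        inside = x_min <= row["x"] <= x_max and y_min <= row["y"] <= y_max
        tall_enough = min_height is None or row["height_cm"] > min_height
        if inside and tall_enough:
            count += 1
    return count


rows = load_rows(SAMPLE_CSV)
total_in_region = count_in_region(rows, 0, 3, 0, 3)
tall_in_region = count_in_region(rows, 0, 3, 0, 3, min_height=180)
print(total_in_region, tall_in_region)
```

If each row instead held a head count for a location, the same loop would add up that column rather than counting rows.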


6
HOLA447


Yes. That appears to be pretty much what I had in mind (at least for the 'basic' plotting). Although the plots appear to be only isoline plots? Ideally I want gradient colour plots (i.e. a seamlessly smooth transition of colour from one value level to the next).
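
The difference between an isoline plot and a gradient colour plot comes down to how a value is turned into a colour: in fixed bands, or by blending smoothly between colour stops. Below is a minimal Python sketch of the smooth version. The colour stops and the function names are arbitrary choices for illustration, not something any of the tools in this thread prescribes.

```python
# Colour stops as (position in 0..1, (red, green, blue)); an arbitrary choice.
STOPS = [
    (0.0, (0, 0, 255)),   # lowest values: blue
    (0.5, (0, 255, 0)),   # middle values: green
    (1.0, (255, 0, 0)),   # highest values: red
]


def lerp(a, b, t):
    """Linear interpolation between a and b for t in 0..1."""
    return a + (b - a) * t


def value_to_rgb(value, v_min, v_max, stops=STOPS):
    """Map a value to an RGB colour by blending between neighbouring stops."""
    if v_max == v_min:
        t = 0.0
    else:
        t = (value - v_min) / (v_max - v_min)
    t = min(1.0, max(0.0, t))  # clamp values outside the range
    for (t0, c0), (t1, c1) in zip(stops, stops[1:]):
        if t <= t1:
            f = (t - t0) / (t1 - t0)
            return tuple(round(lerp(a, b, f)) for a, b in zip(c0, c1))
    return stops[-1][1]


for v in (0, 25, 50, 75, 100):
    print(v, value_to_rgb(v, 0, 100))
```

Applying this to every cell of a gridded data set gives the 'temperature map' look. Plotting libraries do the same thing internally; in Python's matplotlib, for instance, `pcolormesh` and `imshow` draw continuous colour, while `contour` draws isolines.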


7
HOLA448


You will achieve your objective in R and an appropriate 'Google query' will probably find you the few sentences of code you require already written in an online tutorial. All depends upon your starting level of expertise though.


8
HOLA449


Thanks. I will start to check out this R thing you mention......


9
HOLA4410


Errrrr......scrub that. I've just looked at the Wiki page, and it looks terrifying!

I think I really want (need!) some reasonably dedicated charting software. Will pay IF I have to, though making cheeky use of a month's free trial is do-able?


10
HOLA4411
11
HOLA4412

Personally, I would load it into a SQL Database (Microsoft have a free express edition). I would probably try to use the reporting features in SQL Server to create the heatmap, or failing that write some code in .net to do it but I would write some SQL to do most of the work.

https://www.microsof...er-express.aspx

If you have some experience of SQL it would be doable, and you could get up to speed relatively quickly (and for free), as it's relatively quick to install and get started.

The other advantage of using SQL Server is that it supports geometry, i.e. the ability to define points and write queries that are aware of geometry (calculate whether points are within a shape, aggregate and subtract shapes), which is probably something you're looking to do.

http://technet.micro...y/bb964711.aspx

Having said that... start simple, load the data and see if you can write some Simple SQL to process it in the way you need.
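
To give a feel for the 'start simple' route, here is a sketch that uses SQLite through Python's built-in sqlite3 module as a stand-in for SQL Server Express, so it is not the SQL Server setup described above and it uses no geometry types. The table name, column names and sample rows are invented for illustration, with each row assumed to be one person.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (x REAL, y REAL, height_cm REAL)")

sample = [
    (1.0, 2.0, 170),
    (1.5, 2.5, 182),
    (4.0, 4.0, 165),
    (2.0, 1.0, 190),
    (9.0, 9.0, 175),
]
conn.executemany("INSERT INTO people VALUES (?, ?, ?)", sample)

# Count the people inside a rectangle who are over a given height.
region_query = """
SELECT COUNT(*) FROM people
WHERE x BETWEEN ? AND ?
  AND y BETWEEN ? AND ?
  AND height_cm > ?
"""
(tall_count,) = conn.execute(region_query, (0, 3, 0, 3, 180)).fetchone()

# Density per 5 x 5 grid cell (assumes non-negative coordinates):
# the raw material for a heatmap.
grid_query = """
SELECT CAST(x / 5 AS INTEGER) AS cell_x,
       CAST(y / 5 AS INTEGER) AS cell_y,
       COUNT(*) AS n
FROM people
GROUP BY cell_x, cell_y
ORDER BY cell_x, cell_y
"""
cells = conn.execute(grid_query).fetchall()

print(tall_count)
print(cells)
```

The two queries cover the region total with a height filter and the per-cell density. The plain SQL should carry over to other databases with little change, whereas geometry queries are specific to each product.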


12
HOLA4413
13
HOLA4414


I concur with Matlab, or Mathematica. I'm Matlab biased and use Excel purely for holding data to import.

Creating usable and pretty plots is a bit of an art form, it will take time to make something user friendly.

There are decent freebies out there too:

Octave

Freemat

Scilab

All have steep learning curves if you have no programming background. Easy if you've done C++ or similar.


14
HOLA4415

The R suggestion above is the answer I'd vote for. My initial response was Matlab, but if you haven't got that then R is a good free alternative.


15
HOLA4416

I am now beginning to realise how steep the learning curve is, and I appear to have volunteered for something more 'meaty' than I thought. I had merely assumed that the sort of plot I described would be fairly commonplace, and hence fairly straightforward to do, with at least some modern user-friendly (i.e. no programming required) software tools available for these purposes. All of those mentioned are beyond me at the moment.

I have just started to look at Sigmaplot12. It appears to offer a means to produce contour plots of the type I described, but even that is involving a bit of ploughing through the help documentation (which I have to say is pretty poor, lacking examples)

Don't misunderstand me. I admire the skill/knowledge required by all you chaps, and will perhaps when I have the time kickstart my brain cells into tackling this sort of thing using some of the tools you have described. But, right now, I am still at the stage where I am getting my head round the fact that I have to first 'prepare' the raw data by overlaying some sort of 'grid'? Never mind what sort of 'interpolation' to use!
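
The 'grid' and 'interpolation' steps can be sketched in a few lines of Python. The sketch lays a regular grid over scattered (x, y, value) points and estimates each grid node by inverse distance weighting, which is just one simple interpolation method among several and not necessarily what Sigmaplot or any other package uses. The function names and sample points are made up for illustration.

```python
import math


def idw(points, gx, gy, power=2.0):
    """Inverse-distance-weighted estimate at (gx, gy) from (x, y, value) points."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in points:
        distance = math.hypot(gx - x, gy - y)
        if distance == 0.0:
            return value  # grid node sits exactly on a data point
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def make_grid(points, x_min, x_max, y_min, y_max, nx, ny):
    """Return ny rows of nx interpolated values (nx and ny must be at least 2)."""
    grid = []
    for j in range(ny):
        gy = y_min + (y_max - y_min) * j / (ny - 1)
        row = []
        for i in range(nx):
            gx = x_min + (x_max - x_min) * i / (nx - 1)
            row.append(idw(points, gx, gy))
        grid.append(row)
    return grid


# Hypothetical scattered measurements: (x, y, value).
points = [(0.0, 0.0, 10.0), (4.0, 0.0, 20.0), (0.0, 4.0, 30.0), (4.0, 4.0, 40.0)]
grid = make_grid(points, 0, 4, 0, 4, nx=5, ny=5)

for row in grid:
    print(" ".join(f"{value:6.2f}" for value in row))
```

Every point contributes to every node here, which is fine for a small sample but slow for tens of thousands of rows; a real tool would normally restrict each node to nearby points.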

contour plot steps.jpg



16
HOLA4417

If you have the x and y fixed how you want already in Excel, a ghetto contour map (without actual contour lines) can be made by selecting the whole area, running conditional formatting to colour-gradient the values, and then zooming out a lot. I've done this in the past for quick exploratory data analysis to home in on an interesting region.
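
The same quick-look idea can be tried outside Excel. Below is a rough Python sketch that bins points into grid cells, counts how many fall in each cell, and shades the counts with text characters instead of cell colours. The grid size, the shade characters and the sample points are all arbitrary assumptions for illustration.

```python
def density_grid(points, x_min, x_max, y_min, y_max, nx, ny):
    """Count how many (x, y) points fall into each cell of an nx by ny grid."""
    counts = [[0] * nx for _ in range(ny)]
    for x, y in points:
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            continue  # ignore points outside the mapped area
        # Points on the upper edge are placed in the last cell.
        i = min(int((x - x_min) / (x_max - x_min) * nx), nx - 1)
        j = min(int((y - y_min) / (y_max - y_min) * ny), ny - 1)
        counts[j][i] += 1
    return counts


SHADES = " .:*#"  # from empty to densest


def render(counts):
    """Draw the grid as text, with the largest y values on the top line."""
    peak = max(max(row) for row in counts) or 1
    lines = []
    for row in reversed(counts):
        lines.append("".join(SHADES[(c * (len(SHADES) - 1)) // peak] for c in row))
    return "\n".join(lines)


# Hypothetical positions, one (x, y) pair per person.
points = [(0.5, 0.5), (0.6, 0.4), (0.7, 0.7), (2.5, 2.5), (3.5, 0.5)]
counts = density_grid(points, 0, 4, 0, 4, nx=4, ny=4)
print(render(counts))
```

The counts grid is also the kind of table that could be pasted back into a spreadsheet and coloured with conditional formatting.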

Also I would have a look at the free 30 day trial of Minitab as an option. There are many ways of achieving the same thing (and using R, Matlab, Python, STATA, SAS etc. are probably the expert 'most flexible' options) but you might get something usable quickly with a more GUI based thing like Minitab.


17
HOLA4418


"ghetto contour" ????


18
HOLA4419
19
HOLA4420

Thanks all for the suggestions/info. I will be going through it in due course.

For now though....I have to admit being amazed at how much more 'complicated' it is than I originally thought to make a contour (isoline) plot from a simple three column data file! I never thought I would have to get bogged down in code and stuff......


20
HOLA4421


ROOT will do this for you (though there may be better suggestions from others, ROOT certainly has its issues...). Examples of what you can plot:

http://root.cern.ch/drupal/category/image-galleries/data-analysis-visualization

You do need to be able to write very simple C++ though, but it's free.


Archived

This topic is now archived and is closed to further replies.
