Jump to content
House Price Crash Forum

Archived

This topic is now archived and is closed to further replies.

Fully Detached

Any Excel / Sql Gurus Here?

Recommended Posts

Just wondering if anyone can help me out with a query. I need to compare table data between two tables, which I would normally do with a simple vlookup in Excel. The problem this time is that there is a key field which may contain a different value in each system, but the other values in that row must be a match.

So, for example:

Each table contains a list of fruit pickers. Each picker can have an unlimited number of different colour baskets, and into those baskets he can place a variable number of apples, oranges, strawberries etc - up to 10 different fruits in total.

What I need to ensure is that both systems contain the same fruit picker, and a basket of any colour basket, but there has to be a match so that each basket in each system has a matching - albeit different coloured basket, containing the same numbers of each fruit.

So normally I would use excel and concatenate the picker name, basket colour, and the different fruits and their numbers, and simply do a vlookup on the string.

So, an example comparison that I want to do is this:

Table: A:

Picker: Dave

Blue Basket: 10 apples, 5 oranges, 3 bananas

Red Basket: 3 apples, 2 mangoes, 3 kiwis and 1 grapefruit

Yellow Basket: 1 strawberry

Table B:

Picker: Dave - good, this is a match, as it should be

Purple Basket: 10 apples, 5 oranges, 3 bananas = match (we ignore the colour)

Blue Basket: 3 apples, 2 mangoes, 3 kiwis and 1 grapefruit = (we ignore the colour)

Pink Basket: 3 strawberries = not a match, because the number of berries differs, the colour is irrelevant

I hope that makes sense. If it is any easier, the baskets I am talking about are basically object instances which have different names in different systems. Their names are unimportant, but their contents are.

Many thanks for any help, and apologies for the fruity analogy.

Share this post


Link to post
Share on other sites

I've just realised I've given a very complex explanation of something which could be more simply put as:

Fruit Picker = an object, who's names have to match

Basket = an instance of the object as a container, who' names do not have to match, and in fact will not because the instance names are assigned in each system.

Fruit 1 = a field within this object / instance, where both field name and values have to match

Fruit 2 = a field within this object / instance, where both field name and values have to match

Fruit 3 = as above, etc etc

I'm guessing it might be easier to do this in a SQL query - I am a little rusty with that, but the time this is going to save me if someone can point me in the direction will make it worth dusting off some books and re-learning a bit.

Share this post


Link to post
Share on other sites

Dunno how new you are to SQL, but have you used joins before?

Are the totals that you mention in the first post (e.g. 10 apples) stored directly in the database or are they something you need to SUM or aggregate in some way to calculate?

Share this post


Link to post
Share on other sites

Hey Joe,

I've used simple joins before, so I could probably learn something slightly more complicated.

All the values that I need to query are stored in the same table, so in a simple situation where the colour of the basket also had to match, I would just cat the field values like this:

Picker:Barry&Basket:Blue&Apples:4&Bananas:3

I could do a simple vlookup on that string against the concatenated values from the other table, if the basket colour was not different between the two tables. But because it is, I need to do something a bit smarter, and this is where I is stuck ;-)

Share this post


Link to post
Share on other sites

Something with INDEX and MATCH should work, without the table in front of me I'm useless with Excel question though!

Share this post


Link to post
Share on other sites

Be helpful to show the table columns and datatypes.

Datatypes is not really important - for the sake of this, consider them all as strings.

So for the table columns, we have:

Picker Name

Basket Colour

Basket Item 1 Name

Basket Item 1 Value

Basket Item 2 Name

Basket Item 2 Value

etc

Hope that helps.

In this case the only

Share this post


Link to post
Share on other sites

So two objects are equal, regardless of instance name, if they are of the same type (class name) and all thier properties match in terms of name and value?

Are the properties of one allowed to be a subset of the other's or must they all match one for one? Is the order of properties important?

Share this post


Link to post
Share on other sites

Can't you sort by picker in both tables and then by basket 1, basket 2 etc? Eyeball the data to see where you might automate a foolproof query. Or you could add another index or hash key based on the numbers of fruits in the respective baskets.

Or not.

Share this post


Link to post
Share on other sites

Just wondering if anyone can help me out with a query. I need to compare table data between two tables, which I would normally do with a simple vlookup in Excel. The problem this time is that there is a key field which may contain a different value in each system, but the other values in that row must be a match.

So, for example:

Each table contains a list of fruit pickers. Each picker can have an unlimited number of different colour baskets, and into those baskets he can place a variable number of apples, oranges, strawberries etc - up to 10 different fruits in total.

What I need to ensure is that both systems contain the same fruit picker, and a basket of any colour basket, but there has to be a match so that each basket in each system has a matching - albeit different coloured basket, containing the same numbers of each fruit.

So normally I would use excel and concatenate the picker name, basket colour, and the different fruits and their numbers, and simply do a vlookup on the string.

So, an example comparison that I want to do is this:

Table: A:

Picker: Dave

Blue Basket: 10 apples, 5 oranges, 3 bananas

Red Basket: 3 apples, 2 mangoes, 3 kiwis and 1 grapefruit

Yellow Basket: 1 strawberry

Table B:

Picker: Dave - good, this is a match, as it should be

Purple Basket: 10 apples, 5 oranges, 3 bananas = match (we ignore the colour)

Blue Basket: 3 apples, 2 mangoes, 3 kiwis and 1 grapefruit = (we ignore the colour)

Pink Basket: 3 strawberries = not a match, because the number of berries differs, the colour is irrelevant

I hope that makes sense. If it is any easier, the baskets I am talking about are basically object instances which have different names in different systems. Their names are unimportant, but their contents are.

Many thanks for any help, and apologies for the fruity analogy.

I assume the Blue Basket is also a match? If just one fruit had a different amount picked e.g. apples 2 would it still be a match?

You say up to 10 fruits. Are all 10 listed in the table for each basket with a value of 0 if none were picked?

Share this post


Link to post
Share on other sites

Could well be way off here but:

SELECT Picker, Basket, BasketItem1

FROM tblA a JOIN tblB b ON a.Picker = b.Picker

WHERE a.BasketItem1 = b.BasketItem1

UNION

SELECT Picker, Basket, BasketItem2

FROM tblA a JOIN tblB b ON a.Picker = b.Picker

WHERE a.BasketItem2 = b.BasketItem2

UNION

etc

Share this post


Link to post
Share on other sites

Could well be way off here but:

SELECT Picker, Basket, BasketItem1

FROM tblA a JOIN tblB b ON a.Picker = b.Picker

WHERE a.BasketItem1 = b.BasketItem1

UNION

SELECT Picker, Basket, BasketItem2

FROM tblA a JOIN tblB b ON a.Picker = b.Picker

WHERE a.BasketItem2 = b.BasketItem2

UNION

etc

Couldn't he just do it as one? e.g.

SELECT Picker, Basket, BasketItem1, Basketitem2, (through to B10)

FROM tblA a JOIN tblB b ON a.Picker = b.Picker

WHERE a.BasketItem1 = b.BasketItem1 AND a.BasketItem2 = b.BasketItem2 AND (through to B10)

That was why I was asking if just one different fruit number meant it wasn't a match and if the non picked fruits had zero values.

Share this post


Link to post
Share on other sites

Many thanks for the replies guys - it will take me some time to run through the suggestions and give some answers, but one question I can answer now is:

I assume the Blue Basket is also a match? If just one fruit had a different amount picked e.g. apples 2 would it still be a match?

You say up to 10 fruits. Are all 10 listed in the table for each basket with a value of 0 if none were picked?

Yes, blue basket is a match because the colour is ignored.

If no fruits were picked, then the 10 columns containing field name and the 10 columns containing the related values would be blank. And if only 5 were picked, then 5 would be populated and 5 would be blank - hope that makes sense.

I am going to have a banana and try some of the suggestions - cheers! ;)

Share this post


Link to post
Share on other sites

Since Fully Detached seems to have nipped off for the moment, and as I'm not likely to see such a convergence of data processing types on here again any time soon, I can't help myself blatently hyjacking the thread to mention an online tool I've just published which might occassionally come in handy for some of you: whalemerge.com. Would be interested to know what anyone thinks. It's pretty basic but does handle some stuff I find myself doing every now and again. (Sorry Fully Detached, I'll give your problem my full attention in return!).

Share this post


Link to post
Share on other sites

Couldn't he just do it as one? e.g.

SELECT Picker, Basket, BasketItem1, Basketitem2, (through to B10)

FROM tblA a JOIN tblB b ON a.Picker = b.Picker

WHERE a.BasketItem1 = b.BasketItem1 AND a.BasketItem2 = b.BasketItem2 AND (through to B10)

That was why I was asking if just one different fruit number meant it wasn't a match and if the non picked fruits had zero values.

Guys, I am sorry, now I have some food inside me I realise I've messed up the question. The good news is I think it's more simple than I thought previously.

Firstly, I should make clear that what I am trying to do is find where rows of data are present in table A, but not in table B, so that I can update table B to make the two match. Table A will always be the master table, there is no need for two way synchronisation.

Secondly, I gave you the wrong column info. The columns are:

Picker

Basket

Fruit

The Picker field is nice and straightforward, just a simple name

The Fruit field is also quite straightforward - we can forget numbers entirely for now, we just need to know that there are bananas in one basket, but bananas and oranges in another.

The Basket field is the pivotal one (in fact I use this field in an Excel Pivot table as JUst Yield suggested) when I want to be able to display the contents of all baskets picked by Dave, for example.

Here are some actual rows of data, as it may explain more clearly:

Picker Basket Fruit

Dave Blue Bananas

Dave Blue Oranges

Dave Blue Apples

Dave Red Bananas

Dave Red Melons

Dave Green Kiwis

Dave Green Oranges

Dave Green Berries

I hope that formats ok - the point is that the first three lines form a single instance of a basket and that is what I need to compare. The next two lines form another instance of a basket, and the last three form another.

Now, if the colour of the basket were the same in both tables, it would be more simple, but imagine that in table B, the colours of the baskets were different, but the pickers and the contents were the same, then that is exaclty what I am trying to achieve.

Sorry for the crap question framing to start with :(

Share this post


Link to post
Share on other sites

Firstly, I should make clear that what I am trying to do is find where rows of data are present in table A, but not in table B, so that I can update table B to make the two match. Table A will always be the master table, there is no need for two way synchronisation.

Now he tells us :lol:

Do you have Access? You could probably do it in 2 minutes with Create Table using Wizard.

Share this post


Link to post
Share on other sites

Now he tells us :lol:

Do you have Access? You could probably do it in 2 minutes with Create Table using Wizard.

Yes, sorry about that :)

I don't have access but I do have MySql on my computer and can run queries.

I'm in the process of looking at Excel arrays, as I think they might be a solution as well.

Share this post


Link to post
Share on other sites

Has the requirement to compare on the numeric value for each fruit been dropped?

Not dropped no, but I figured it was easier to leave that out while I corrected what I had said orignally.

So to answer your earlier question, yes, the numbers of each fruit also need to match, but I don't think that should complicate it too much. The tricky bit (I think) is in converting multiple rows of table data into something that can be compared, but where one of the fields is certain to be different, but that this field acts as a "container" for all the other fields.

My programming knowledge is practically zero, but I am thinking of the basket as being like an array, where the comparison needs to be made against the contents of the array, but where the array name itself does not need to match.

Tell me to shut up if I'm just confusing things more ;)

Share this post


Link to post
Share on other sites

If you don't absolutely have to have an Excel or SQL based solution and you're willing to trust a bit of programming, I think it would be easy enough to do in code, for instance in C# I believe this would do the trick for your simplified sample data:

var grouped1 = table1.OrderBy(x => x.Fruit).GroupBy(x => new { x.Picker, x.Basket });
var grouped2 = table2.OrderBy(x => x.Fruit).GroupBy(x => new { x.Picker, x.Basket });
var unmatched = grouped1.Where(g => grouped2.Where(x => x.Key.Picker == g.Key.Picker && x.Select(y => y.Fruit).SequenceEqual(g.Select(y => y.Fruit))).Count() != 1);

Share this post


Link to post
Share on other sites

If you don't absolutely have to have an Excel or SQL based solution and you're willing to trust a bit of programming, I think it would be easy enough to do in code, for instance in C# I believe this would do the trick for your simplified sample data:

var grouped1 = table1.OrderBy(x => x.Fruit).GroupBy(x => new { x.Picker, x.Basket });

var grouped2 = table2.OrderBy(x => x.Fruit).GroupBy(x => new { x.Picker, x.Basket });

var unmatched = grouped1.Where(g => grouped2.Where(x => x.Key.Picker == g.Key.Picker && x.Select(y => y.Fruit).SequenceEqual(g.Select(y => y.Fruit))).Count() != 1);

Thanks evictee - although my programming skills are non existent I can pretty much see what you're doing there. Might be time to download eclipse and have a stab at it. This could get messy ;)

Share this post


Link to post
Share on other sites

You can do it in SQL using with a HAVING clause. The comparison would be based off the COUNT being equal the SUM of a CASE statement that evaluated 1 for matching detail lines and 0 for non-matching. It's definitely possible but you will have to get your head around the SQL set based stuff and for large sets of data it won't have good performance.

Share this post


Link to post
Share on other sites

You can do it in SQL using with a HAVING clause. The comparison would be based off the COUNT being equal the SUM of a CASE statement that evaluated 1 for matching detail lines and 0 for non-matching. It's definitely possible but you will have to get your head around the SQL set based stuff and for large sets of data it won't have good performance.

Thanks TPM - that sounds very interesting, but probably a bit too advanced for my current level of understanding :)

I will try the C# suggestion, and see how I get on with that. Otherwise I'll spend the weekend learning some SQL and give that a try.

Share this post


Link to post
Share on other sites

  • Recently Browsing   0 members

    No registered users viewing this page.

  • The Prime Minister stated that there were three Brexit options available to the UK:   215 members have voted

    1. 1. Which of the Prime Minister's options would you choose?


      • Leave with the negotiated deal
      • Remain
      • Leave with no deal

    Please sign in or register to vote in this poll. View topic


×

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.