House Price Crash Forum
honeybadger

Oddities in Land Registry data (for data nerds only)


I download Land Registry price-paid data every month to monitor prices. The files I download are the yearly ones from 2005 onwards.

My expectation was that the data for years in the distant past would be unchanging from one download to the next. However, that is not the case; even for the earliest years, there is variation. Attached is a plot that shows, for each year, the number of datapoints in the file (y axis) against the download date (x axis).

E.g., for 2005 data, there are around 1000 fewer datapoints in the most recently downloaded data than in the data downloaded in Jan 2017. This is out of a total of approx. 1 million datapoints, so in relative terms (about 0.1%) the variation is not large. Nevertheless, it is surprising.
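For anyone who wants to reproduce the counting, here is a minimal sketch. It assumes (this is my own hypothetical layout, not something described above) that each monthly download is kept in its own folder named after the download date, e.g. `downloads/2017-01/pp-2005.csv`, and that the CSV files have no header row.

```python
import csv
from pathlib import Path


def count_rows(csv_path):
    """Count the records in one price-paid CSV file (assumed to have no header row)."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return sum(1 for _ in csv.reader(f))


def counts_by_download(root):
    """Return {download_label: {year: row_count}} for a tree like root/2017-01/pp-2005.csv."""
    counts = {}
    for snapshot_dir in sorted(Path(root).iterdir()):
        if not snapshot_dir.is_dir():
            continue
        per_year = {}
        for csv_path in sorted(snapshot_dir.glob("pp-*.csv")):
            year = csv_path.stem.split("-", 1)[1]  # "pp-2005" -> "2005"
            per_year[year] = count_rows(csv_path)
        counts[snapshot_dir.name] = per_year
    return counts
```

Plotting `counts[label][year]` against `label` for each year should give the kind of curve described above.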

There's also a common pattern in years 2005-2013. The number of datapoints consistently increased with each download until late 2017/early 2018, when suddenly there were large removals of data (the exact month of the removals varies).

This raises a couple of questions:

1. Why were properties still being added, even recently, to datasets from almost 15 years ago?

2. Why was data removed in late 2017/early 2018?

If anyone knows what might be going on here I'd be quite interested!

LR.png


The removals could be a data cleaning process.

Have you identified those additions / removals?

Have you compared their characteristics with those of the general population?

Can you identify duplicates?
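One way to identify the additions and removals is to compare the transaction IDs in two downloads of the same yearly file. This is only a sketch: it assumes the ID is the first column of the headerless CSV and may be wrapped in curly braces, and the function names are made up for illustration.

```python
import csv


def load_ids(csv_path):
    """Return the set of transaction IDs (first column) in a price-paid CSV file."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        # Strip the curly braces that may wrap IDs in the raw files.
        return {row[0].strip("{}") for row in csv.reader(f) if row}


def diff_snapshots(old_path, new_path):
    """Return (added, removed): IDs only in the newer file, and IDs only in the older file."""
    old_ids, new_ids = load_ids(old_path), load_ids(new_path)
    return new_ids - old_ids, old_ids - new_ids
```

With the two ID sets in hand, the added or removed rows can be pulled out and compared against the rest of the file.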


Good point regarding data cleaning, that could certainly explain the removals.

I haven't yet done anything other than the counting presented above, but I do intend to go a bit deeper into it.

 

6 hours ago, Freki said:

Can you identify duplicates?

 

Yes. It seems that in the older files there are duplicated entries with different prices, one of which gets removed in the newer files.
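A rough sketch of how such duplicates could be found. The thread does not say how a duplicate was defined, so this assumes a duplicate is a pair of rows with the same transfer date, postcode, PAON and SAON, and that those fields sit at the column positions shown (my assumption about the headerless CSV layout).

```python
import csv
from collections import defaultdict

# Assumed column positions in the headerless price-paid CSV layout.
PRICE, DATE, POSTCODE, PAON, SAON = 1, 2, 3, 7, 8


def conflicting_duplicates(csv_path):
    """Return {key: sorted prices} for sales recorded more than once with different prices."""
    prices = defaultdict(set)
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f):
            if len(row) <= SAON:
                continue  # skip blank or truncated lines
            key = (row[DATE], row[POSTCODE], row[PAON], row[SAON])
            prices[key].add(int(row[PRICE]))
    return {key: sorted(p) for key, p in prices.items() if len(p) > 1}
```

Running this on an older and a newer download of the same year and comparing the results would show which of the conflicting entries was later dropped.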

I don't yet have an explanation for the steady flow of additions in the older files.

 


Are those entries creating duplicates? 

It could be their dirty way of updating a field on a row... You never know

Edited by Freki


Additions to the older files seem odd. Of course, newer files are updated as and when people register the sale - which can take a while - but I doubt that's what's happening in this case. I guess you could contact them directly and ask?

Data Services Team
HM Land Registry
Rosebrae Court 
Woodside Ferry Approach
Birkenhead
Merseyside
CH41 6DU

Email: data.services@landregistry.gov.uk

Telephone: 0300 006 0478

 


https://www.gov.uk/government/publications/quality-assurance-of-administrative-data-in-the-uk-house-price-index/hm-land-registry-data

Section 4.3 refers to checks on Price Paid info and data. The shorter-term target accuracy seems to be 98%, so e.g. 1000/1000000 (0.1%) is well within that margin of error, but I assume it's corrected/improved upon over time with periodic bulk amendments in the annual data.

There's some info here about changes to the HPI estimation methodology from Dec 2017:
https://www.gov.uk/government/publications/about-the-uk-house-price-index/quality-and-methodology

While that is about the HPI rather than price paid data, it refers to delays in processing new registrations, such as for new build transactions (2.2). Perhaps this results in (and resulted in) adjustments to dates of associated price paid data?

As above, maybe just ask them - they've been cool with checking/explanation when I've had questions, even when I was getting it completely wrong or just grumbling about change to data formats.

10 hours ago, Horseradish said:

Additions to the older files seem odd. Of course, newer files are updated as and when people register the sale - which can take a while - but I doubt that's what's happening in this case. I guess you could contact them directly and ask?



 

 

6 hours ago, guest_northshore said:

As above, maybe just ask them - they've been cool with checking/explanation when I've had questions, even when I was getting it completely wrong or just grumbling about change to data formats.

 

I'll do this, thanks for the help. Will update this thread when I get an answer.

 

11 hours ago, Freki said:

Are those entries creating duplicates? 

It could be their dirty way of updating a field on a row... You never know

I don't think the additions are creating duplicates. I just took a few at random and it wasn't the case for those.


Text of email sent to the land registry:

 

Dear land registry data team,

I've found that over time, additions are made to the price paid
data for sales that happened a very long time ago. For instance,
the sales with the IDs pasted below all occurred in 2005, but
they did not appear in the downloadable data (i.e., pp-2005.csv)
until April 2017.

I'd be very grateful if you could explain how this can happen.

Best regards,
[Honeybadger]


49B7852A-A008-7921-E050-A8C063056E8D
49B7852A-2C53-7921-E050-A8C063056E8D
49B78529-DE65-7921-E050-A8C063056E8D
49B7852A-9BB4-7921-E050-A8C063056E8D
49B7852A-9B99-7921-E050-A8C063056E8D
49B78529-AFF0-7921-E050-A8C063056E8D
49B7852A-89A7-7921-E050-A8C063056E8D
49B7852A-4973-7921-E050-A8C063056E8D
49B7852A-7A08-7921-E050-A8C063056E8D
49B7852A-5364-7921-E050-A8C063056E8D
49B7852A-D6A9-7921-E050-A8C063056E8D
49B78529-C73D-7921-E050-A8C063056E8D
49B7852A-BB5F-7921-E050-A8C063056E8D
49B7852A-BA36-7921-E050-A8C063056E8D
49B7852A-026F-7921-E050-A8C063056E8D
49B7852A-5750-7921-E050-A8C063056E8D
49B7852A-A7C6-7921-E050-A8C063056E8D
49B78529-BAE4-7921-E050-A8C063056E8D
49B7852A-0B3B-7921-E050-A8C063056E8D
49B78529-C7C1-7921-E050-A8C063056E8D
49B7852A-282A-7921-E050-A8C063056E8D
49B7852A-9CA1-7921-E050-A8C063056E8D
49B7852A-CBF5-7921-E050-A8C063056E8D
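For reference, the month in which an ID first shows up can be worked out along these lines. This is a minimal sketch: `snapshots` is a hypothetical list of (download label, set of IDs) pairs built from the monthly downloads, ordered oldest first.

```python
def first_seen(snapshots):
    """Map each ID to the label of the earliest snapshot containing it.

    `snapshots` is a list of (label, ids) pairs, ordered oldest download first.
    """
    seen = {}
    for label, ids in snapshots:
        for transaction_id in ids:
            # setdefault keeps the first (oldest) label recorded for each ID.
            seen.setdefault(transaction_id, label)
    return seen
```

Any ID whose first-seen label is much later than its transfer date is a late addition of the kind listed above.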

7 hours ago, honeybadger said:

Text of email sent to the land registry:

 


Keen to know what they say!

On 07/07/2018 at 14:25, Horseradish said:

Keen to know what they say!

Here's the response:

Quote

Firstly each month's file will always contain older data, whether these are additions, changes or deletes. This is because we report on transactions as they are lodged with Land Registry and these include properties being registered voluntarily for the first time, which can date back quite far. We also add any missing sales as they are reported to us by customers using the customer change request form. Some sales are incorrectly excluded at the time of registration and when we are informed we endeavour to put this right.

In the most recent month files however, we began using an alternative system for obtaining our data, this enabled us to identify transactions that we were previously unable to provide due to missing information. We also applied some changes to data that we had identified to be incomplete or inaccurate at the time of first publication, these would be seen as changes to the data.

It doesn't really provide a concrete answer to the specific question I asked, so I wrote back to ask for clarification on the particular entries I gave.


I am fascinated by this thread. About two years ago some friends bought a place on the outskirts of London and, being nosey by nature, I looked up their sale price. Later on I needed their house number so I tried the reverse: I looked for the sale at the right month and price, but it was gone. The Rightmove advert is even still online (though inactive). I am certain that their house appeared on the Land Registry for a few months and then promptly disappeared. I even considered contacting the Land Registry to find out what the deal was, but I figured that it really wasn't my business. Now this thread reveals that this isn't an isolated case.

My working hypothesis can only be sinister: as always, these friends said they felt they had achieved "a great price for the area", yadda yadda, we hear it every time someone buys. But what if they did? The only competent reason to remove their sale listing is that they genuinely did get a good deal and TPTB didn't want it to impact the HPI figures. That is the only rational reason I can conceive of. Otherwise the only remaining possibility is that the Land Registry are totally incompetent and routinely lose data.

Anyway, thanks for flagging that this is not a one-off. I am very interested to hear what you find. If there is an easy way to check whether their sale price would contradict the official Land Registry HPI figures then let me know and I will check.


Mmmm...this all sounds pretty dodgy to me! Sounds like someone wants to selectively use only the "nice" data they want to use?

36 minutes ago, bushblairandbrown said:

...

The only competent reason to remove their sale listing is that they genuinely did get a good deal and TPTB didn't want it to impact the HPI figures. That is the only rational reason I can conceive of. Otherwise the only remaining possibility is that the Land Registry are totally incompetent and routinely lose data.

 

 


I think you should put more trust in the system, especially for this kind of thing. I would not go down the conspiracy theory road.

And thanks for the answer, still interesting.

Edited by Freki

2 minutes ago, Freki said:

I think you should put more trust in the system, especially for this kind of thing. I would not go down the conspiracy theory road.

And thanks for the answer, still interesting.

I'm also not inclined to assume a nefarious cause.

Nevertheless, it would be nice to understand the regions of almost constant upward slope in the plots. I can't think of a good explanation for that.
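To put a number on those slopes, the change in row count between consecutive downloads can be computed as in the sketch below. The input is a hypothetical list of (download label, row count) pairs for one yearly file, in date order; the figures in the usage note are made up.

```python
def count_deltas(counts):
    """Return (label, change) pairs: the row-count change between consecutive downloads."""
    return [
        (later_label, later_count - earlier_count)
        for (_, earlier_count), (later_label, later_count) in zip(counts, counts[1:])
    ]
```

For example, `count_deltas([("2017-01", 1000), ("2017-02", 1040), ("2017-03", 1080)])` gives `[("2017-02", 40), ("2017-03", 40)]`. A near-constant positive delta corresponds to the steady upward slope, and a large negative delta to a bulk removal.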

1 hour ago, bushblairandbrown said:

Anyway, thanks for flagging that this is not a one-off. I am very interested to hear what you find. If there is an easy way to check whether their sale price would contradict the official Land Registry HPI figures then let me know and I will check.

Not sure if this is what you're after, but I can easily look up any entry in the official LR data. Send me a message if that would be of interest.


I very much doubt that the Land Registry would deliberately fiddle the figures, but I have long suspected that Estate Agents might deliberately register high value sales immediately but 'accidentally on purpose' forget to register low value sales for a few years - perhaps after a more recent high value sale in the same area - so it always looks like a rising market.

The only way to stop that kind of massaging would be to set a deadline, perhaps 6 months or a year and refuse to register any sale after the deadline has passed.

Ideally I'd go a step further and make it impossible to transfer the title deeds without a registered sale price at the Land registry - how hard could it be to do this stuff properly?

9 hours ago, Habeas Domus said:

I very much doubt that the Land Registry would deliberately fiddle the figures, but I have long suspected that Estate Agents might deliberately register high value sales immediately but 'accidentally on purpose' forget to register low value sales for a few years - perhaps after a more recent high value sale in the same area - so it always looks like a rising market.

The only way to stop that kind of massaging would be to set a deadline, perhaps 6 months or a year and refuse to register any sale after the deadline has passed.

Ideally I'd go a step further and make it impossible to transfer the title deeds without a registered sale price at the Land registry - how hard could it be to do this stuff properly?

That sounds like a lovely idea.


One thing I saw recently was a neighbour who bought 2 years ago for approx £300k and ‘sold’ for £400k recently. However, the visible owners did not change (same car etc.). I didn’t look into it too deeply; I guessed maybe he sold the house to his own company or something, presumably as some kind of tax- or finance-efficiency measure.

On 14/07/2018 at 00:39, Habeas Domus said:

I very much doubt that the Land Registry would deliberately fiddle the figures, but I have long suspected that Estate Agents might deliberately register high value sales immediately but 'accidentally on purpose' forget to register low value sales for a few years - perhaps after a more recent high value sale in the same area - so it always looks like a rising market.

The only way to stop that kind of massaging would be to set a deadline, perhaps 6 months or a year and refuse to register any sale after the deadline has passed.

Ideally I'd go a step further and make it impossible to transfer the title deeds without a registered sale price at the Land registry - how hard could it be to do this stuff properly?

Estate agents play no part in the registration process. The purchaser's solicitors lodge the paperwork. Where there is a mortgage, the mortgagee will pay close attention to the completion of the registration.

It is impossible already to register a sale without a sale price as it is an integral part of the conveyancing document. 

#conspiracytheory #fullofholes


Ok so following a bit more research here is an update. 

Looking at the Land Registry HPI figures for the LA and the type of property, the sale price was in excess of the HPI figures by over 12%. I wonder then if they removed the sale because it was an outlier. There was a good reason for it being an outlier: it had been extended. I'm not sure it would be at all logical to remove the sale from the data on those grounds, but perhaps it could get added back in if it sells again. Maybe I am just wrong. Who knows?


Well, I got another reply when asking about some specific entries from 2005 that only started appearing in 2017. The response was as follows:

Quote

The entries quoted from 2005 appear to be part of the continuing cleansing process mentioned in our previous correspondence. 

They've been surprisingly responsive, but without giving very much detail. I don't think it's really worth pursuing further, given the small number of datapoints that are involved.

1 hour ago, honeybadger said:

Well, I got another reply when asking about some specific entries from 2005 that only started appearing in 2017. The response was as follows:

They've been surprisingly responsive, but without giving very much detail. I don't think it's really worth pursuing further, given the small number of datapoints that are involved.

Sounds fair enough. I was wondering more about the removals, but given the 1000/1000000 change it's probably unlikely that there's any collective explanation beyond something like a methodology tweak, or just errors.

Guess an analysis of differences (e.g. any pattern in distribution/initial registration date) may tell you more, but would be a big hassle for a small and probably not meaningful relative change to total.
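If anyone does want to try it, a first pass at the distribution could be as simple as counting the added records by month of transfer. This sketch assumes the ID is in the first column and the transfer date, starting with `YYYY-MM`, is in the third column of the headerless CSV; `added_ids` is a hypothetical set of IDs that appear only in the newer download.

```python
import csv
from collections import Counter


def added_by_month(new_csv_path, added_ids):
    """Count the added records per transfer month (YYYY-MM)."""
    months = Counter()
    with open(new_csv_path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f):
            if len(row) > 2 and row[0].strip("{}") in added_ids:
                months[row[2][:7]] += 1  # "2005-03-01 00:00" -> "2005-03"
    return months
```

A flat spread across months would point towards general cleansing, while a spike in particular months might hint at a single batch correction.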

Edited by guest_northshore


I bought a new-build at the end of 2008 and my neighbour bought a month before me.  My house appeared in the sales data as did all the subsequent purchases in the development.  However my neighbour's property didn't appear until 2016 when it was mysteriously added. 

I had considered I'd bought at a good price as everything had been dropping like a stone in the previous few months and people buying in the spring/summer of 2008 had paid a lot more as it turned out.  There is at least a 3 month lag usually before you can get this info.  Anyway it turned out the neighbour paid even less than I did. 

I put it down to a lazy/dodgy solicitor but I wonder if it was deliberately held back as he got the best bargain.  He was a FTB so maybe there was some other deal which delayed it but 8 years is some delay!

