R*-tree in Go: usage examples

Обложка: Как искать данные в пространстве невероятно быстро? R*-tree в Go

Hello everyone I am a game developer in three-dimensional games with a lot of experience and am currently switching to the Go language and switching to Highload. Today I would like to tell you what an interesting idea I came across and solved one interesting problem.

Have you ever wondered how to work quickly with any spatial data so that it is fast and efficient?

***

Background

In my experience, I have participated in projects with a workload of 2-3 thousand people.
On average, about 500-1000 people are located on one gaming “machine”.
On the experience there was one project where there were 3 thousand people at all: two game servers of 1500 thousand people

For the most part, I dealt with the client and the logical core of the server in the form of database accesses, working with the cache and communication between the client and the server.
Therefore, in all these game projects, unfortunately, I did not have access to a lower level of the network like “synchronization of players between other players”, except for one project where one person helped me. He told me about how synchronization works between players in his game.

Imagine a three-dimensional game map with 1000 players on the server.  The map is big enough, so it doesn’t make sense for players to see each other over huge distances. This is a meaningless load.

It is most effective for the player to give some kind of field of view (for example, a circle or square around the player), where he would see the players nearby. This is how it works in the game, and such a field of view is called “stream zone».

How is this stream zone calculated?

We will not delve into the life cycle of the game server … Literally every second the player’s visibility zone is recalculated on the server. They — the players — are constantly moving, moving around the map by car or on foot.

So, the same person who helped me told me that in his game with 1000 online servers, the calculation goes like this: a player is taken — all other players are calculated to him, and so for all players. That is, it is a 1000 * 1000 cycle.

I was very surprised, so I started digging…

***

We approach the R-tree. What is it, why and why?

While I was digging through databases to prepare for interviews on Go, I came across the GiST index, which is a subspecies of the R-tree structure. I started to study what it is — and that’s exactly what I needed!

What is R-tree? R-tree is a tree—like data structure that allows you to work with geospatial data fairly quickly.

But R*-tree what is it then? R*-tree is a subspecies of the R*—tree structure, which is easier to build, but at the same time allows you to access data faster than a regular R-tree

***

A little more about the field in which I work

I told this person about this data structure (R-tree) and threw off one interesting article, and he gladly accepted this idea and will implement it in the near future.

I showed the developer literally by example, taking the rtreego library, with what difference it is possible to work with this data in time. The load with 4000 players with R-tree can be the same as it has with 1000 players when it is implemented!

I was very pleased, because I solved one interesting problem, but here it was even more interesting… I myself skimmed through the article that I threw off, and this person pointed out to me that the article describes R*-tree, and not just R-tree.

I was extremely interested to find out the difference between these algorithms, so I began to dig further…

***

What is the difference between R-tree and R*-tree?

R-tree is incredibly easy to build and just as incredibly easy to find. This tree has no requirements.

R*-tree is more difficult to build, but it is more effective in search queries.

What is so difficult about building an R*-tree?

Let’s take as an example the construction of R-tree, R-tree with linear separation and R*-tree:

In the third variant, we can see a clearer and more obvious separation of elements by branches (red squares) and trees (blue squares), while in the first and second variants we have the most real porridge.

With the usual insertion of an element into an R*-tree, the tree works the same way as in an R-tree, but.
When the “leaves” on which the elements are stored need to be separated — R-tree does it without steaming, while R*-tree tries to make it so that:

  1. The leaves occupied as small an area as possible
  2. The leaves overlapped as little as possible on other leaves nearby
  3. The leaves were equally distributed

Due to compliance with these three rules, the construction can be 2-3 times more difficult in time, but! Searching by elements will be 1.5 — 3 times more efficient.

If you need to use static data, R*-tree will be for you!

So where should I use R-tree, and where is R*-tree?

We realized that the R-tree is quite fast when inserting, but slower in search.
It is impossible to simply take and “move” data through the tree, but you can only delete an element and insert it again.
Therefore, we come to the conclusion.

Where can I use R-tree?

  1. Game streaming (the player’s field of view in space)
  2. Maps where items can move (the same maps with couriers or taxis, for example!)
  3. Any geodata that is updated quite often, but reading is a little less important.

Where can I use R*-tree?

  1. The same game streaming, but static objects! For example, the player has a huge map with many elements (like cities in GTA, for example). Searching for these elements, even if there are a million of them, can be done in miles (or even nano) seconds!
  2. Maps where there are incredibly many static elements. (For example, a map of establishments, houses, or elements that change quite rarely)
  3. Any geodata that is almost not updated, but requires incredibly fast reading

These two data structures can be combined.
For example, a courier delivery map, on which ordinary houses are located in one structure, and in the second there will be couriers who move.

***

Thus, I found out exactly where in games (and not only) it is possible to use this data incredibly effectively, so I will take this data structure into service.

***

We return to the question of R*-tree in the Go language…

It took me quite a long time to figure out how it works.
First I tried to rewrite the R*-tree from Delphi to Go. It took me two days and… it didn’t work out because I didn’t understand the structure of this structure.

Then I figured out the way for myself:

  1. Completely rewrite and disassemble every small action in a ready-made R-tree solution (rtreego)
  2. Parse an article about R*-tree, which had Delphi and pseudocode sources
  3. Make from R-tree — R*-tree

Spoiler alert: I managed it

The sources of what was done and what it was made of

Firstly, I want to show my great gratitude to the person who created rtreego — dhconnelly (the author of this library).

Secondly, I want to express my incredible gratitude to the person who literally described an article about R*-tree on his own experience. There was incredibly little information, but this article answered many questions! Article *tyk*

Well, and thirdly, articles from the wiki: R*-tree (wikipediam), R-tree (wikipedia in Russian)

From rtreego, I took R-tree as a basis and, after digesting the Delphi code and pseudocode from the article, I managed to create an R*-tree in Go, which works exactly as described by the algorithm.

Link to the repository:

The most important thing —

Thank you all so much for your attention and for being able to master this incredibly not a small one an article!

Outsourced Software Development Services | Dedicated Software Development Team

Ready to see us in action:

More To Explore

IWanta.tech
Logo
Enable registration in settings - general
Have any project in mind?

Contact us:

small_c_popup.png