How to Import the Bitcoin Blockchain into Neo4j

By Greg Walker, learn me a bitcoin | January 9, 2018

[As community content, this post reflects the views and opinions of the particular author and does not necessarily reflect the official stance of Neo4j.]

This guide runs through the basic steps for importing the bitcoin blockchain into a Neo4j graph database.

The entire process is just about taking data from one format (blockchain data), and converting it into another format (a graph database). The only thing that makes this slightly trickier than typical data conversion is that it’s helpful to understand the structure of bitcoin data before you get started.

However, once you have imported the blockchain into Neo4j, you can perform analysis on the graph database that would not be possible with SQL databases. For example, you can follow the path of bitcoins to see if two different addresses are connected:

Screenshot of connected Bitcoin Addresses in the Neo4j Browser.

In this guide I will cover:

  1. How bitcoin works, and what the blockchain is.
  2. What blockchain data looks like.
  3. How to import the blockchain data into Neo4j.

This isn’t a complete tutorial on how to write your own importer tool. However, if you’re interested, you can find my bitcoin-to-neo4j code on GitHub, although I’m sure you could write something cleaner after reading this guide.

1. What Is Bitcoin?

Bitcoin is a computer program.

It’s a bit like uTorrent: you run the program, it connects to other computers running the same program, and it shares a file. However, the cool thing about bitcoin is that anyone can add data to this shared file, and any data already written to the file cannot be tampered with.

As a result, Bitcoin creates a secure file that is shared on a distributed network.

What can you do with this?

In bitcoin, each piece of data that gets added to this file is a transaction. Therefore, this decentralised file is being used as a “ledger” for a digital currency (i.e., cryptocurrency).

This ledger is called the blockchain.

Where can I find the blockchain?

If you run the Bitcoin Core program, the blockchain will be stored in a folder on your computer:

  • Mac: ~/Library/Application Support/Bitcoin/blocks
  • Windows: C:\Users\YourUserName\AppData\Roaming\Bitcoin\blocks

When you open this directory you should notice that instead of one big file, you will find multiple files with the name blkXXXXX.dat. This is the blockchain data, but split across multiple smaller files.
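As a minimal sketch in Python, the files in that folder could be collected in name order like this. The function name list_blk_files and the example path are assumptions made for illustration.

```python
from pathlib import Path


def list_blk_files(blocks_dir):
    """Return the blk*.dat files found in blocks_dir, sorted by file name."""
    return sorted(Path(blocks_dir).glob("blk*.dat"))


# Hypothetical location; substitute the blocks folder on your own machine.
for path in list_blk_files("path/to/Bitcoin/blocks"):
    print(path.name)
```

Sorting by name gives the order in which the files were created, because the numbers in the file names are zero-padded.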

2. What Does the Blockchain Look Like?

The blk.dat files contain serialized data of blocks and transactions.

Blocks

Blocks are separated by magic bytes, which are then followed by the size of the upcoming block.
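A rough sketch of walking through one file block by block, assuming the main network's magic bytes (f9beb4d9) and a 4-byte little-endian size field. The name read_blocks is made up for this example, and the sketch simply stops at the first thing that does not look like a block.

```python
import io
import struct

MAINNET_MAGIC = bytes.fromhex("f9beb4d9")  # magic bytes of the main bitcoin network


def read_blocks(stream):
    """Yield the raw bytes of each block found in an open blk*.dat stream."""
    while True:
        prefix = stream.read(8)  # 4 magic bytes followed by a 4-byte block size
        if len(prefix) < 8:
            return  # end of file
        if prefix[:4] != MAINNET_MAGIC:
            return  # padding or unexpected data: stop reading
        size = struct.unpack("<I", prefix[4:])[0]
        yield stream.read(size)


# Usage with an in-memory stream standing in for a real file:
fake_file = io.BytesIO(MAINNET_MAGIC + struct.pack("<I", 3) + b"abc")
for raw_block in read_blocks(fake_file):
    print(len(raw_block), "bytes")
```

With a real file you would pass open(path, "rb") instead of the in-memory stream.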

Each block then starts with a block header:

A block is basically a container for a list of transactions. The header is like the metadata at the top.

Block Header Example:
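To illustrate, here is a minimal Python sketch that decodes an 80-byte header into its six fields and computes the block hash (a double SHA-256 of the header, shown byte-reversed). The bytes used are the widely published header of the very first bitcoin block, which is an assumption brought in for this example rather than data from this guide. The function and key names are also illustrative.

```python
import hashlib
import struct


def decode_block_header(header):
    """Decode an 80-byte block header into a dict of its fields plus the block hash."""
    version, prev, merkle, timestamp, bits, nonce = struct.unpack("<i32s32sIII", header)
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return {
        "hash": digest[::-1].hex(),         # hashes are displayed byte-reversed
        "version": version,
        "prevblock": prev[::-1].hex(),
        "merkleroot": merkle[::-1].hex(),
        "timestamp": timestamp,             # seconds since the Unix epoch
        "bits": bits.to_bytes(4, "big").hex(),
        "nonce": nonce,
    }


# Header of the first block, written field by field (all fields are little-endian).
GENESIS_HEADER = bytes.fromhex(
    "01000000"                                                          # version
    + "00" * 32                                                         # previous block
    + "3ba3edfd7a7b12b27ac72c3e67768f617fc81bc3888a51323a9fb8aa4b1e5e4a"  # merkle root
    + "29ab5f49"                                                        # time
    + "ffff001d"                                                        # bits
    + "1dac2b7c"                                                        # nonce
)
header = decode_block_header(GENESIS_HEADER)
for field, value in header.items():
    print(field, value)
```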

Transactions

After the block header, there is a variable-length integer (a single byte for small counts) that tells you the number of transactions in the block. After that, you get serialized transaction data, one after the other.
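The transaction count is stored in bitcoin's variable-length integer format: values below 0xfd fit in one byte, and the prefixes 0xfd, 0xfe and 0xff announce a 2, 4 or 8 byte little-endian number. A small sketch of reading one, with read_varint being a name chosen for this example:

```python
import io
import struct


def read_varint(stream):
    """Read one variable-length integer from a byte stream."""
    first = stream.read(1)[0]
    if first < 0xfd:
        return first  # the byte itself is the value
    if first == 0xfd:
        return struct.unpack("<H", stream.read(2))[0]
    if first == 0xfe:
        return struct.unpack("<I", stream.read(4))[0]
    return struct.unpack("<Q", stream.read(8))[0]


# A count of 300 does not fit in one byte, so it is written as fd 2c 01.
tx_count = read_varint(io.BytesIO(bytes.fromhex("fd2c01")))
print(tx_count)
```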

A transaction is just another chunk of code, but transactions are more structurally interesting.

Each transaction has the same pattern:

1. Select Outputs (we call these Inputs).
  • Unlock these inputs so that they can be spent.
2. Create Outputs.
  • Lock these outputs to a new address.

So after a series of transactions, you have a transaction structure that looks something like this:

This is a simplified diagram of what the blockchain looks like. As you can see, it looks like a graph.

Transaction Example:
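As a sketch of that pattern in code, the following Python decodes one “original” style (non-segwit) transaction from a byte stream: a version, a list of inputs, a list of outputs, and a locktime. The function names and dictionary keys are assumptions made for this example, and segwit transactions are deliberately not handled.

```python
import hashlib
import io
import struct


def read_varint(stream):
    """Read one variable-length integer from a byte stream."""
    first = stream.read(1)[0]
    width = {0xfd: 2, 0xfe: 4, 0xff: 8}.get(first)
    return first if width is None else int.from_bytes(stream.read(width), "little")


def decode_transaction(stream):
    """Decode one non-segwit transaction starting at the stream's current position."""
    start = stream.tell()
    version = struct.unpack("<i", stream.read(4))[0]
    inputs = []
    for _ in range(read_varint(stream)):
        txid = stream.read(32)[::-1].hex()                 # the transaction being spent from
        vout = struct.unpack("<I", stream.read(4))[0]      # which of its outputs
        script_sig = stream.read(read_varint(stream)).hex()  # the unlocking code
        sequence = struct.unpack("<I", stream.read(4))[0]
        inputs.append({"txid": txid, "vout": vout, "scriptSig": script_sig, "sequence": sequence})
    outputs = []
    for _ in range(read_varint(stream)):
        value = struct.unpack("<q", stream.read(8))[0]     # amount in satoshis
        script_pubkey = stream.read(read_varint(stream)).hex()  # the locking code
        outputs.append({"value": value, "scriptPubKey": script_pubkey})
    locktime = struct.unpack("<I", stream.read(4))[0]
    # The txid is the double SHA-256 of the raw transaction bytes, byte-reversed.
    end = stream.tell()
    stream.seek(start)
    raw = stream.read(end - start)
    txid = hashlib.sha256(hashlib.sha256(raw).digest()).digest()[::-1].hex()
    return {"txid": txid, "version": version, "inputs": inputs,
            "outputs": outputs, "locktime": locktime}
```

Because the function leaves the stream positioned at the end of the transaction, it can be called repeatedly to read all the transactions in a block.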

3. How to Import the Blockchain into Neo4j

Well, now we know what the blockchain data represents (and that it looks a lot like a graph), we can go ahead and import it into Neo4j. We do this by:

1. Reading through the blk.dat files.
2. Decoding each block and transaction we run into.
3. Converting the decoded block/transaction into a Cypher query.

Here’s a visual guide to how I represent Blocks, Transactions and Addresses in the database:

Blocks

1. CREATE a :block node, and connect it to the previous block it builds upon.
  • SET each field from the block header as properties on this node.
2. CREATE a :coinbase node coming off each block, as this represents the “new” bitcoins being made available by the block.
  • SET a value property on this node, which is equal to the block reward for this block.

Transactions

1. CREATE a :tx node, and connect it to the :block we had just created.
  • SET properties (version, locktime) on this node.
2. MERGE existing :output nodes and relate them [:in] to the :tx.
  • SET the unlocking code as a property on the relationship.
3. CREATE new :output nodes that this transaction creates.
  • SET the respective values and locking codes on these nodes.

Addresses

If the locking code on an :output contains an address…

1. CREATE an :address node, and connect the output node to it.
  • SET the address as a property on this node.
  • Note: If different outputs are connected to the same address, then they will be connected to the same address node.
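To make “contains an address” concrete, here is a hedged Python sketch that recognises only the most common locking code pattern (pay-to-pubkey-hash: 76 a9 14, a 20-byte hash, then 88 ac) and turns the hash into a main network address with Base58Check. Other script types would need their own handling, and the function names are invented for this example.

```python
import hashlib

BASE58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check(payload):
    """Encode payload (version byte + data) with a 4-byte checksum in base58."""
    data = payload + hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58[remainder] + encoded
    # Each leading zero byte is written as the character "1".
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def address_from_script(script):
    """Return the address locked by a pay-to-pubkey-hash script, or None otherwise."""
    is_p2pkh = (len(script) == 25
                and script[:3] == b"\x76\xa9\x14"
                and script[23:] == b"\x88\xac")
    if not is_p2pkh:
        return None
    return base58check(b"\x00" + script[3:23])  # 0x00 is the main network version byte
```

Returning None for unrecognised scripts matches the idea above: an :address node is only created when the locking code actually contains an address.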

4. Cypher Queries

Here are some example Cypher queries you could use for the purpose of inserting blocks and transactions into Neo4j.

Note: You will need to decode the block headers and transaction data to get the parameters for the Cypher queries.

Block

Parameters (example):
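A possible shape for a block query and its parameters, written as a Python string and dictionary so they could be handed to a Neo4j driver. This is a sketch that follows the Blocks steps described earlier, not necessarily the query used by my importer. The relationship names (:chain, :coinbase), the property names and the parameter names are all assumptions. MERGE is used instead of CREATE so that a block whose successor was imported first is not duplicated.

```python
BLOCK_QUERY = """
MERGE (block:block {hash: $blockhash})
SET block.size = $size,
    block.version = $version,
    block.merkleroot = $merkleroot,
    block.time = $timestamp,
    block.bits = $bits,
    block.nonce = $nonce,
    block.txcount = $txcount
MERGE (prevblock:block {hash: $prevblock})
MERGE (block)-[:chain]->(prevblock)
MERGE (block)-[:coinbase]->(coinbase:coinbase)
SET coinbase.value = $reward
"""


def block_parameters(header, size, txcount, reward):
    """Build the parameter map for BLOCK_QUERY from a decoded block header (a dict)."""
    return {
        "blockhash": header["hash"],
        "version": header["version"],
        "prevblock": header["prevblock"],
        "merkleroot": header["merkleroot"],
        "timestamp": header["timestamp"],
        "bits": header["bits"],
        "nonce": header["nonce"],
        "size": size,          # block size in bytes
        "txcount": txcount,    # number of transactions in the block
        "reward": reward,      # block reward in satoshis
    }
```

With the official Python driver, something like session.run(BLOCK_QUERY, block_parameters(...)) would execute it. The header argument is assumed to be a dictionary with the keys read above.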

Transaction

Note: This query uses the FOREACH hack, which acts as a conditional and will only create the :address nodes if the $addresses parameter actually contains an address (i.e., if it is not empty).

Parameters (example):
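A sketch of what a transaction query and its parameters might look like, again as Python data for a driver. It follows the Transactions and Addresses steps described earlier but is not necessarily my importer's query. The relationship names (:inc, :in, :out, :locked), the "txid:vout" index format and all parameter names are assumptions. Here the conditional effect comes from looping over each output's addresses list with FOREACH, which does nothing when the list is empty. Coinbase inputs are not treated specially.

```python
TX_QUERY = """
MATCH (block:block {hash: $blockhash})
MERGE (tx:tx {txid: $txid})
MERGE (tx)-[:inc {i: $i}]->(block)
SET tx.version = $version, tx.locktime = $locktime
FOREACH (inp IN $inputs |
  MERGE (spent:output {index: inp.index})
  MERGE (spent)-[:`in` {vin: inp.vin, scriptSig: inp.scriptSig, sequence: inp.sequence}]->(tx)
)
FOREACH (outp IN $outputs |
  MERGE (out:output {index: outp.index})
  MERGE (tx)-[:out {vout: outp.vout}]->(out)
  SET out.value = outp.value, out.scriptPubKey = outp.scriptPubKey
  FOREACH (addr IN outp.addresses |
    MERGE (address:address {address: addr})
    MERGE (out)-[:locked]->(address)
  )
)
"""


def tx_parameters(tx, blockhash, position):
    """Build the parameter map for TX_QUERY from a decoded transaction (a dict)."""
    inputs = [
        {"vin": n, "index": f'{i["txid"]}:{i["vout"]}',   # identifies the output being spent
         "scriptSig": i["scriptSig"], "sequence": i["sequence"]}
        for n, i in enumerate(tx["inputs"])
    ]
    outputs = [
        {"vout": n, "index": f'{tx["txid"]}:{n}',          # identifies the new output
         "value": o["value"], "scriptPubKey": o["scriptPubKey"],
         "addresses": o.get("addresses", [])}              # empty list: no :address node
        for n, o in enumerate(tx["outputs"])
    ]
    return {"txid": tx["txid"], "blockhash": blockhash, "i": position,
            "version": tx["version"], "locktime": tx["locktime"],
            "inputs": inputs, "outputs": outputs}
```

Using MERGE on the output's index means an input can refer to an output that has not been imported yet, and the two will still end up as the same node.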

5. Results

If you have inserted the blocks and transactions using the Cypher queries above, then these are some examples of the kind of results you can get out of the graph database.

Block

Transaction

Address

Paths

Finding paths between transactions and addresses is probably the most interesting thing you can do with a graph database of the bitcoin blockchain, so here are some examples of Cypher queries for that:

Between Outputs

Between Addresses
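As a sketch, path queries of this kind can be written with Cypher's shortestPath function. The two queries below are kept as Python strings, with $source and $target as assumed parameter names. The relationship names (:in, :out, :locked) and the index and address properties are also assumptions about how the graph was built.

```python
# Shortest chain of spends linking two outputs (each identified by "txid:vout").
PATH_BETWEEN_OUTPUTS = """
MATCH (a:output {index: $source}), (b:output {index: $target})
MATCH path = shortestPath((a)-[:`in`|out*]-(b))
RETURN path
"""

# Shortest connection between two addresses, going through the outputs locked to them.
PATH_BETWEEN_ADDRESSES = """
MATCH (a:address {address: $source}), (b:address {address: $target})
MATCH path = shortestPath((a)-[:`in`|out|locked*]-(b))
RETURN path
"""
```

The relationships are matched without a direction so that a connection is found whichever of the two came first.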

Conclusion

This has been a simple guide on how you can take the blocks and transactions from blk.dat files (the blockchain) and import them into a Neo4j database.

I think it’s worth the effort if you’re looking to do serious graph analysis on the blockchain. A graph database is a natural fit for bitcoin data, whereas using an SQL database for bitcoin transactions feels like trying to shove a square peg into a round hole.

I’ve tried to keep this guide short, so I haven’t covered things like:

1. Reading through the blockchain. Reading the blk.dat files is easy enough. However, the annoying thing about these files is that the blocks are not written to them in sequential order, which makes setting the height on a block or calculating the fee for a transaction a bit trickier (but you can code around it).
2. Decoding blocks and transactions. If you want to use the Cypher queries above, you will need to get the parameters you require by decoding the block headers and raw transaction data as you go. You could write your own decoders, or you could try using an existing bitcoin library.
3. Segregated Witness. I’ve only given a Cypher query for an “original” style transaction, which was the only transaction structure used up until block 481,824. However, the structure of a segwit transaction is only slightly different (but it might need its own Cypher query).
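On the first point, one way to code around out-of-order blocks is to record each block's previous-block hash first and work out heights afterwards. The Python sketch below assumes every block back to the first one has been read and that the first block's previous hash does not appear as a block itself. The name assign_heights is made up for this example, and blocks on side branches simply get a height along their own branch.

```python
def assign_heights(prev_of):
    """Map each block hash to its height, given {block_hash: previous_block_hash}."""
    heights = {}
    for block_hash in prev_of:
        # Walk backwards until reaching a block with a known height,
        # or a previous hash that is not a known block (the start of the chain).
        chain = []
        current = block_hash
        while current in prev_of and current not in heights:
            chain.append(current)
            current = prev_of[current]
        base = heights.get(current, -1)  # -1 so that the first block gets height 0
        for offset, visited in enumerate(reversed(chain), start=1):
            heights[visited] = base + offset
    return heights


# Blocks listed out of order, as they might be found in the files:
print(assign_heights({"c": "b", "a": "0" * 64, "b": "a"}))
```

The walk is done with a loop rather than recursion so that a chain of several hundred thousand blocks does not hit Python's recursion limit.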

Nonetheless, hopefully this guide has been somewhat helpful.

But as always, if you understand how the data works, converting it to a different format is just a matter of sitting down and writing the tool.
