Handling Long String Storage in Solana Programs (On-chain Size Limits & Off-chain Hashing Trade-offs) #54

@danielwangai

Hi,
First, thank you for the Solana boot camp resources. They are phenomenal.

I noticed something in project 4, crud-app. In the video, the journal entry's fields include a title and a message, each with a length constraint:

#[account]
pub struct JournalEntryState {
    pub owner: Pubkey,
    pub title: String, // max length of 50
    pub message: String, // max length of 1000
}

After I implemented the CreateEntry account and its instruction create_journal_entry, the program built with no errors.

I then wrote a test to check whether the program rejects a message longer than 1000 characters:

it("rejects message longer than 1000 characters", async () => {
      await airdrop(bob.publicKey);
      try {
        const title = "unique title";
        const longMessage = "x".repeat(1001);
        // Derive the PDA from the same key that is passed as `owner` below (bob).
        const [pda] = getJournalEntryAddress(
          title,
          bob.publicKey,
          program.programId,
        );
        await program.methods
          .createJournalEntry(title, longMessage)
          .accounts({
            journalEntry: pda,
            owner: bob.publicKey,
            systemProgram: anchor.web3.SystemProgram.programId,
          })
          .signers([bob])
          .rpc({ commitment: "confirmed" });
      } catch (error) {
        console.log("ERROR>>: ", error);
      }
    });

// get pda journal entry
const getJournalEntryAddress = (
    title: string,
    owner: PublicKey,
    programId: PublicKey,
  ) => {
    return anchor.web3.PublicKey.findProgramAddressSync(
      [anchor.utils.bytes.utf8.encode(title), owner.toBuffer()],
      programId,
    );
  };

I get this error:

RangeError: encoding overruns Buffer

So I reduced the length of the message in the test to 900 characters and got a different error:

Transaction too large: 1260 > 1232

So the test fails when trying to store long strings. This got me curious about how to handle really large content on-chain.
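For context on the second error: 1232 bytes is the maximum size of a serialized Solana transaction (the 1280-byte IPv6 minimum MTU minus 48 bytes of headers), and that budget covers signatures, account keys, and the blockhash as well as the instruction data. The sketch below is a rough estimate for a legacy transaction with one instruction. The shape it assumes (two signers, i.e. a separate fee payer plus bob, and five account keys) is a guess about the test setup rather than something stated above, and the helper names are made up for illustration.

```typescript
// Rough size estimate for a legacy (non-versioned) Solana transaction that
// carries a single instruction. All inputs are assumptions about the test setup.

const PACKET_DATA_SIZE = 1232; // 1280 (IPv6 minimum MTU) - 40 (IPv6 header) - 8 (fragment header)

// Size in bytes of a compact-u16 ("shortvec") length prefix for the value n.
function compactU16Size(n: number): number {
  if (n < 0x80) return 1;
  if (n < 0x4000) return 2;
  return 3;
}

interface TxShape {
  signers: number;     // required signatures
  accountKeys: number; // unique account keys, including the program id
  ixAccounts: number;  // accounts referenced by the instruction
  ixDataLen: number;   // instruction data length in bytes
}

function estimateLegacyTxSize(s: TxShape): number {
  const signatures = compactU16Size(s.signers) + 64 * s.signers;
  const header = 3; // three one-byte counters in the message header
  const keys = compactU16Size(s.accountKeys) + 32 * s.accountKeys;
  const blockhash = 32;
  const instructionCount = compactU16Size(1);
  const instruction =
    1 + // program id index
    compactU16Size(s.ixAccounts) + s.ixAccounts + // one-byte index per account
    compactU16Size(s.ixDataLen) + s.ixDataLen;
  return signatures + header + keys + blockhash + instructionCount + instruction;
}

// Anchor instruction data: 8-byte discriminator, then each Borsh string as a
// 4-byte length prefix followed by its UTF-8 bytes.
function anchorStringArgsLen(utf8ByteLengths: number[]): number {
  return 8 + utf8ByteLengths.reduce((sum, len) => sum + 4 + len, 0);
}

// "unique title" is 12 bytes; the shortened message is 900 bytes.
const dataLen = anchorStringArgsLen([12, 900]);
const size = estimateLegacyTxSize({
  signers: 2,     // assumed: provider wallet as fee payer, plus bob
  accountKeys: 5, // assumed: fee payer, bob, PDA, system program, program id
  ixAccounts: 3,  // journalEntry, owner, systemProgram
  ixDataLen: dataLen,
});
console.log(`instruction data: ${dataLen} bytes, transaction: ~${size} bytes`);
console.log(`fits in one packet: ${size <= PACKET_DATA_SIZE}`);
```

Under those assumptions the estimate comes out a little over 1260 bytes, so a 900-byte message already exceeds the limit even though the account itself has room for 1000. In a single transaction of this shape, only about 870 bytes remain for the message.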

One approach I tried was hashing the message and storing only the hash on-chain. The downside is that hashing is one-way: you can't recover the original message from the hash, so anyone who fetches the account directly from the blockchain sees only a hash value, which isn't human-readable or useful on its own. To make it practical, I stored the hash on-chain and kept the full content off-chain. This adds some complexity: you need to hash the message when writing, and you must keep the on-chain and off-chain versions in sync on every update or delete. The trade-off is worth it though: with this approach, I was able to handle strings as long as 3000 characters.
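A minimal sketch of the client side of that approach. The post does not say which hash function was used, so SHA-256 from Node's built-in crypto module is an assumption here, and the helper names and the `message_hash` field mentioned in the comments are hypothetical.

```typescript
import { createHash } from "node:crypto";

// SHA-256 digest of the UTF-8 message. The result is always 32 bytes,
// regardless of how long the message is.
function hashMessage(message: string): Buffer {
  return createHash("sha256").update(message, "utf8").digest();
}

// Check content fetched from off-chain storage against the hash read from
// the on-chain account (e.g. a hypothetical `message_hash: [u8; 32]` field).
function matchesOnChainHash(
  offChainMessage: string,
  onChainHash: Uint8Array,
): boolean {
  return hashMessage(offChainMessage).equals(onChainHash);
}

// Usage: a 3000-character message still produces a 32-byte value to store.
const message = "x".repeat(3000);
const messageHash = hashMessage(message);
console.log(messageHash.length);
console.log(matchesOnChainHash(message, messageHash));
console.log(matchesOnChainHash(message + "!", messageHash));
```

Because the stored value has a fixed size, the instruction data no longer grows with the message. The verification helper is what lets a reader confirm that the off-chain copy has not drifted from what was committed on-chain.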

cc @brimigs
