PIR-Eng-Notes/research/Group.1b.Stateless.Client.Stateless.Server/hintlesspir_2023/HintlessPIR_2023_notes.md at main · 0xalizk/PIR-Eng-Notes

HintlessPIR — Engineering Notes

_{1. Lineage}
_{2. Core Idea}
_{3. Variants}
_{4. Novel Primitives / Abstractions}
_{5. Cryptographic Foundation}
_{6. Key Data Structures}
_{7. Database Encoding}
_{8. Protocol Phases}
_{9. Communication Breakdown}
_{10. Correctness Analysis}
_{11. Complexity}

_{12. Performance Benchmarks}
_{13. Application Scenarios}
_{14. Deployment Considerations}
_{15. Key Tradeoffs & Limitations}
_{16. Comparison with Prior Work}
_{17. Portable Optimizations}
_{18. Implementation Notes}
_{19. Open Problems}
_{20. Related Papers in Collection}
_{21. Uncertainties}

Field	Value
Paper	Hintless Single-Server Private Information Retrieval (2023)
Archetype	Construction + Building-block (LinPIR / composable preprocessing are novel primitives)
PIR Category	Group 1b — Stateless Client, Stateless Server
Security model	Semi-honest single-server CPIR
Additional assumptions	Circular security of RLWE (required for seed reuse across queries); Random Oracle Model (for public randomness expansion from seed)
Correctness model	Probabilistic (correctness error at most delta, configurable; see Lemma 19)
Rounds (online)	1 (non-interactive: client sends query, server returns response)
Record-size regime	Small to Moderate (8 B to 32 KB benchmarked; supports larger via column stacking)

Lineage ⤴

Field	Value
Builds on	SimplePIR / DoublePIR [Group 2a] (LWE-based PIR with client-independent hints); Spiral [Group 1a] (RLWE-based composition with Regev+GSW ciphertexts); Halevi-Shoup matrix-vector multiplication algorithm [31]
What changed	SimplePIR requires clients to download a large database-dependent hint (H = DB * A). HintlessPIR eliminates this hint by having the server homomorphically compute H * s via a new LinPIR sub-protocol (NTTlessPIR), using RLWE-based composable preprocessing to make the homomorphic computation efficient. This removes all offline client-server communication while maintaining throughput close to SimplePIR. ¹
Superseded by	YPIR [Group 1b] (2024, eliminates offline communication via CDKS ring packing and GPU acceleration); later Group 1b schemes (WhisPIR, InsPIRe, VIA) build on similar stateless paradigms
Concurrent work	Tiptoe PIR [33] — concurrent independent work that similarly removes the SimplePIR hint using RLWE, but with O(sqrt(m) + n^2) online communication (roughly two orders of magnitude larger than HintlessPIR for large databases) ²

Core Idea ⤴

HintlessPIR eliminates the database-dependent client hint from SimplePIR by replacing the local client computation c_0 = H * s (which requires the hint H = DB * A) with a secure server-side computation via a new Linear PIR (LinPIR) sub-protocol called NTTlessPIR. ³ The key technical innovation is homomorphic encryption with composable preprocessing: RLWE-based ciphertexts consist of a public-randomness part alpha(ct) (depending only on the random seed) and a secret-dependent part beta(ct). Since alpha is known in advance, the server can preprocess the expensive parts of all homomorphic operations (gadget products, key-switching, rotations) offline, leaving only cheap linear operations online. ⁴ This yields an O(log n) asymptotic speedup in the homomorphic computation. Combined with CRT decomposition to handle arbitrary moduli and RLWE slot packing, the scheme achieves throughput up to 60% of SimplePIR while requiring no offline communication, no client-side database-dependent state, and no server-side client-dependent state. ⁵

Variants ⤴

Variant	Key Difference	Offline Comm	Online Comm (asymptotic)	Best For
HintlessPIR	LWEPIR + NTTlessPIR composition; sqrt(m) bandwidth	O(1) (none)	O(sqrt(m) + n)	Most practical databases up to ~1 TB
TensorPIR	3D database; tensor product of two selection vectors	O(1) (none)	O(m^{1/3} + n)	Extremely large databases (>= 1 TB)

Novel Primitives / Abstractions ⤴

LinPIR (Linear PIR)

Field	Detail
Name	Linear PIR (LinPIR)
Type	Cryptographic primitive / PIR generalization
Interface / Operations	S.setup(DB) -> (hint_C, hint_S); C.query(a, C_hint) -> (qu, C_state); S.response(qu, S_hint) -> rsp; C.recover(C_state, rsp) -> sum_i a_i * DB_i mod p ⁶
Security definition	For all (i, j) in [m]^2, distributions of query(i) and query(j) are computationally indistinguishable (Definition 7, p.16) ⁷
Correctness definition	(1 - delta)-correct: Pr[C.recover != sum_i a_i * DB_i] <= delta for any database and any linear query a (Definition 6, p.16) ⁸
Purpose	Generalization of standard PIR where a client queries an arbitrary linear combination of records (not just a single record). Standard PIR is LinPIR where a = basis vector. Used to securely compute H * s without transmitting H.
Built from	RLWE-based linearly homomorphic encryption with composable preprocessing
Standalone complexity	NTTlessPIR: Server preprocessing O(k * ell * n * n_cols * log n); Server response time kn_cols(n_rows + n + (ell+2)n); Client upload kn + elln; Client download 2k ceil(n_rows/n) * n (Lemma 6, p.23) ⁹
Relationship to prior primitives	Strictly generalizes standard PIR. LWEPIR (the SimplePIR/FrodoPIR family) is a special case. Tiptoe PIR's LinPIR scheme uses column-major matrix-vector multiplication; NTTlessPIR uses diagonally-dominant decomposition. ¹⁰

Composable Preprocessing (alpha/beta Decomposition)

Field	Detail
Name	Homomorphic Encryption with Composable Preprocessing
Type	Optimization paradigm / cryptographic technique
Interface / Operations	For any HE operation F: Preproc_F(alpha(ek), alpha(ct)) -> g; Apply_F(g, ek, ct) -> result; Apply^alpha_F(g) -> alpha(result) (Equations 2-3, p.18) ¹¹
Security definition	Inherited from underlying RLWE security
Correctness definition	Apply_F produces the same output as Eval_F (the standard homomorphic evaluation)
Purpose	Decomposes any preprocessable HE operation into an expensive offline part (Preproc, depending only on public randomness alpha) and a cheap online part (Apply, linear operations). The key property is composability: alpha of the output depends only on g, enabling cascaded preprocessing across complex circuits. ¹²
Built from	Structural property of RLWE ciphertexts: ct = (a, b) where a = alpha(ct) is public randomness, b = beta(ct) is secret-dependent
Standalone complexity	Preproc_diamond: O(ell * n * log n) Z_q ops, output size (ell+1)n; Apply_diamond: (ell+1)n Z_q ops (Lemma 2, p.20). Preproc_ks: O(ell * n * log n); Apply_ks: (ell+2)n (Lemma 3, p.20). Preproc_rot^oR: O(ell * n * R * log n); Apply_rot^oR: (R-1)(ell+2)n (Lemma 4, p.21). ¹³
Relationship to prior primitives	Analogous to SimplePIR's hint precomputation (server precomputes DB * A), but generalized to the RLWE setting and applicable to arbitrary composable circuits of HE operations, not just matrix-vector multiplication. ¹⁴

Cryptographic Foundation ⤴

Layer	Detail
Hardness assumption	Decision LWE (for LWEPIR component) and Decision RLWE (for NTTlessPIR component) ¹⁵
Encryption/encoding scheme(s)	LWE encryption: Enc_s(m; seed) where A := RO(seed), b := A * s + e + Delta * m (Definition 1, p.14); RLWE encryption: Enc_v(m) = [a, a * v + e + Delta * m] (Definition 2, p.14) ¹⁶
Ring / Field	Z[X]/(X^n + 1) with n = 2^12 = 4096 (RLWE ring dimension); ciphertext modulus q approx 2^90 (product of NTT-friendly primes); plaintext moduli p_0, p_1 of 22 bits each for CRT decomposition ¹⁷
Key structure	LWE secret s sampled from chi_sigma^N (ternary distribution) with N = 1408; RLWE secret v sampled from chi_sigma^n. Fresh RLWE key v per query. A expanded from short seed via random oracle. ¹⁸
Correctness condition	Three conditions must simultaneously hold (Lemma 19, p.46): (1) Q > sqrt(n_cols) * p^2 * sigma * sqrt(ln(1 + n_rows/(delta/3))); (2) q > sqrt(ln(1 + k(n_rows+n)/(delta/3))) * max_j sqrt(ell) * N * n * sigma * gamma * p_j^2; (3) product_j p_j > Q * sigma * sqrt(N) * sqrt(ln(1 + kn * ceil(n_rows/n) / (delta/3))) ¹⁹

Key Data Structures ⤴

Database as matrix: DB in Z_p^{sqrt(m) x sqrt(m)} where m is the number of records. For records larger than one Z_p element, each record uses d > 1 LWE plaintext elements stacked vertically in a column of DB. ²⁰
Hint matrix H: H := DB * A in Z_Q^{sqrt(m) x N} — this is the SimplePIR hint. In HintlessPIR, H is never transmitted to the client; instead, H * s is computed homomorphically via NTTlessPIR.
Preprocessed data g_j: Server stores Preproc_{A_i}_i output for each CRT modulus p_j: size kn * n_cols * (ell + 1) elements of Z_q per modulus. This is computed once and reused across all queries. ²¹
RLWE ciphertexts (query): Two compressed RLWE ciphertexts (encoding s mod p_j) plus a compressed rotation key, totaling approximately 323 KB. ²²

Database Encoding ⤴

Representation: sqrt(m) x sqrt(m) matrix over Z_p, following SimplePIR's approach. For NTTlessPIR, the hint matrix H = DB * A is viewed as a database in Z_Q^{sqrt(m) x N} with n_cols = N <= n columns.
Record addressing: Linear index i decomposed as (i_0, i_1) = (i mod d_u, i / d_u) where d_u = sqrt(m). Client queries column i_1 of DB via LWEPIR and retrieves row i_0 of the result.
Preprocessing required: Server computes DB * A (one-time, client-independent) and Preproc for NTTlessPIR (one-time per seed). Both are reusable across all queries.
Record size equation: Each record maps to d = ceil(record_bits / log_2(p)) elements in Z_p. The LWE plaintext space is approximately 8-10 bits for databases up to 2^38 records. ²³

Protocol Phases ⤴

Phase	Actor	Operation	Communication	When / Frequency
Server Setup	Server	Compute H = DB * A; run NTTlessPIR.setup(H) to get preprocessed data g_j for each CRT modulus p_j; set LWEPIR hint to (0, seed)	--	Once per DB / per seed reseed
Query (LWEPIR)	Client	Decompose index i into (i_0, i_1); compute LWEPIR query: Enc_s(u_{i_0}; seed)	sqrt(m) elements of Z_Q upward	Per query
Query (NTTlessPIR)	Client	Sample fresh RLWE key v; compute Enc_v(encode(s mod p_j)) for each CRT modulus p_j, plus rotation key	(k + ell)n elements of Z_q + sqrt(m) elements of Z_Q upward	Per query
Response (LWEPIR)	Server	Compute DB * query (matrix-vector product)	sqrt(m) elements of Z_Q downward	Per query
Response (NTTlessPIR)	Server	For each p_j, compute Apply_{A_i}(g_j, rotated ciphertexts, hat{b}_j)	2k(sqrt(m) + n) elements of Z_q downward	Per query
Recover	Client	Decrypt NTTlessPIR response to get c_0 = H * s; then compute LWEPIR recovery: round((rsp, v_{i_1}) - c_0) / Delta	--	Per query

Communication Breakdown ⤴

Component	Direction	Size (concrete, 8 GB DB)	Reusable?	Notes
LWEPIR query	upward	~130 KB	No	sqrt(m) elements of Z_Q
NTTlessPIR query (RLWE ciphertexts + rotation key)	upward	~323 KB	No	k=2 CRT ciphertexts + compressed rotation key
Total query	upward	~453 KB	No	Per query
LWEPIR response	downward	~130 KB	No	sqrt(m) elements of Z_Q
NTTlessPIR response	downward	~2950 KB	No	2k * ceil(n_rows/n) * n elements of Z_q
Total response	downward	~3080 KB	No	Per query

Correctness Analysis ⤴

Option A: FHE Noise Analysis

The noise analysis uses sub-Gaussian and sub-Exponential parameter tracking under an independence heuristic. ²⁴

Phase	Noise parameter	Growth type	Notes
LWEPIR response	psi_2-norm at most sqrt(n_cols) * p * sigma	additive	Error in DB * (A * s + e) = DB * e (Lemma 16, p.42)
NTTlessPIR: rotation key generation	psi_2-norm at most sqrt(ell) * gamma * sigma^2 * n per key	multiplicative (via diamond-products)	From single ksk to n/2 rotation keys (Lemma 14, p.40)
NTTlessPIR: rotations of ciphertext	psi_2-norm at most sqrt(ell * n_cols) * n * sigma * gamma	cumulative from n_cols rotations	Each rotation adds sqrt(ell) * n * sigma * gamma noise (p.43)
NTTlessPIR: matrix-vector multiply	psi_2-norm at most max_j sqrt(ell * n_cols) * n * sigma * gamma * p_j	multiplicative (plaintext * rotated ct)	Under independence heuristic; error = sqrt(ell * n_cols) * n * sigma * gamma * max_j p_j (Lemma 17, p.43)
CRT interpolation	n/a — deterministic given correct NTTlessPIR decryption	--	Requires product_j p_j > n_cols * norm(DB)_inf * norm(m)_inf

Correctness condition: Three conditions simultaneously (Lemma 19, p.46): LWEPIR decryption, NTTlessPIR decryption, and CRT interpolation must each succeed with probability at least 1 - delta/3.
Independence heuristic used? Yes. The paper assumes intermediate values in homomorphic computations are independent, then validates empirically in Section 7. ²⁵
Dominant noise source: NTTlessPIR matrix-vector multiplication (the sum of n_cols plaintext-ciphertext products after rotation).

Complexity ⤴

Core metrics

Metric	Asymptotic	Concrete (2^30 x 1B = 1.07 GB)	Phase
Query size	O(sqrt(m) + n)	453 KB	Online
Response size	O(sqrt(m) + n)	3080 KB	Online
Server computation	nm + O_tilde(sqrt(m) * n)	613 ms (single-thread)	Online
Client computation	--	51.00 ms	Online (recovery)
Throughput	--	1750 MB/s (single-thread)	Online
Response overhead	O(1) constant factor > SimplePIR	~33x (current impl); ~9x with optimizations ²⁶	--

Preprocessing metrics

Metric	Asymptotic	Concrete (2^30 x 1B = 1.07 GB)	Phase
Server preprocessing	nm + O_tilde(sqrt(m) * n)	199.15 s	Offline (one-time per seed)
Server long-term storage	knN(ell + 1) + sqrt(m) * N elements	~proportional to DB size	Persistent
Client hint download	O(1)	0	N/A (hintless)
Client persistent storage	0	0	--

FHE-specific metrics

Metric	Asymptotic	Concrete	Phase
Multiplicative depth	0 (linear operations only)	0	--
CRT moduli count k	O(1)	2 (two 22-bit NTT-friendly primes)	--
Gadget size ell	ceil(log_z(q))	~3 (for q approx 2^90 with z-base decomposition) ²⁷	--

Performance Benchmarks ⤴

Hardware: AWS r7iz.4xlarge, Intel Sapphire Rapids CPUs at 3.00 GHz, 128 GB RAM, single-threaded (except Table 3). Compiled with clang 16, AVX-512. ²⁸

Table 1: Communication costs (exact, from Table 1 p.28)

Database	2^20 x 8B (8 MB)	2^20 x 256B (268 MB)	2^26 x 8B (537 MB)	2^30 x 1B (1.07 GB)	2^18 x 32KB (8.59 GB)
HintlessPIR Query	334 KB	388 KB	415 KB	453 KB	1502 KB
HintlessPIR Response	288 KB	1540 KB	2212 KB	3080 KB	3080 KB
SimplePIR Hint	16 MB	92 MB	131 MB	185 MB	185 MB
SimplePIR Query	12 KB	66 KB	93 KB	131 KB	1180 KB
SimplePIR Response	12 KB	66 KB	93 KB	131 KB	117 KB
Spiral Params	8 MB	8 MB	9 MB	9 MB	10 MB
Spiral Query	16 KB	16 KB	16 KB	16 KB	16 KB
Spiral Response	21 KB	21 KB	21 KB	21 KB	61 KB

Table 2: Computation costs (exact, from Table 2 p.29)

Database	2^20 x 8B (8 MB)	2^20 x 256B (268 MB)	2^26 x 8B (537 MB)	2^30 x 1B (1.07 GB)	2^18 x 32KB (8.59 GB)
HintlessPIR
Server Online Time	233 ms	385 ms	478 ms	613 ms	1347 ms
Server Throughput	35 MB/s	698 MB/s	1122 MB/s	1750 MB/s	6376 MB/s
Server Preproc. Time	3.11 s	51.57 s	93.58 s	199.15 s	2128 s
Client Recovery Time	4.78 ms	25.50 ms	36.66 ms	51.00 ms	52.32 ms
SimplePIR
Server Online Time	0.75 ms	28 ms	53 ms	105 ms	841 ms
Server Throughput	11201 MB/s	9675 MB/s	10046 MB/s	10268 MB/s	10213 MB/s
Server Preproc. Time	0.96 s	45.39 s	85.24 s	188.03 s	2116 s
Spiral
Server Online Time	691 ms	694 ms	1229 ms	2319 ms	12769 ms
Server Throughput	12 MB/s	387 MB/s	436 MB/s	463 MB/s	673 MB/s
Client Param Gen Time	139 ms	193 ms	326 ms	326 ms	326 ms

Table 3: Multi-threaded results (exact, from Table 3 p.30)

Database	2^20 x 8B (8 MB)	2^20 x 256B (268 MB)	2^26 x 8B (537 MB)	2^30 x 1B (1.07 GB)	2^18 x 32KB (8.59 GB)
HintlessPIR (4 threads) Online	115 ms	162 ms	193 ms	232 ms	418 ms
HintlessPIR (4 threads) Throughput	0.07 GB/s	1.66 GB/s	2.78 GB/s	4.62 GB/s	20.53 GB/s
SimplePIR (4 threads) Online	0.30 ms	7 ms	14 ms	27 ms	213 ms
SimplePIR (4 threads) Throughput	28.15 GB/s	38.78 GB/s	39.22 GB/s	40.36 GB/s	40.42 GB/s

Application Scenarios ⤴

Few-query clients: HintlessPIR is designed for settings where clients make only a few queries before the database updates, making hint amortization impractical. For a single initial query, HintlessPIR has lower total communication than SimplePIR (which must transmit its hint) and lower latency than Spiral (which must transmit evaluation keys). ²⁹
Anonymous queries: The absence of client-dependent server state means the server cannot link queries to clients, enabling anonymous access patterns. ³⁰
Dynamic databases: No client hints to invalidate when the database changes; only the server-side preprocessing must be updated. ³¹

Deployment Considerations ⤴

Database updates: Server must rerun preprocessing (DB * A and Preproc) when DB changes. Preprocessing cost is comparable to SimplePIR's hint generation (~199 s for 1 GB). ³²
Seed reseeding: Server must periodically reseed (regenerate the random oracle seed) every kappa = omega(log n) queries for security. This is amortized: T_amortize = T_preprocess/kappa + T_response = (1 + o(1)) * T_response. ³³
Anonymous query support: Yes (stateless, no client-dependent server state).
Session model: Ephemeral client — no persistent state required between queries.
Cold start suitability: Excellent — no offline communication required; a new client can immediately issue a query.
Amortization crossover vs SimplePIR: HintlessPIR has lower total communication until SimplePIR amortizes its hint over approximately 50-100 queries. Bandwidth advantage over Spiral holds for the first 3-5 queries. ³⁴

Key Tradeoffs & Limitations ⤴

Server throughput gap: HintlessPIR achieves up to 60% of SimplePIR's throughput and up to 9.4x higher than Spiral, but SimplePIR remains ~6x faster per query at large database sizes. The gap narrows as m grows. ³⁵
Response size overhead: Current implementation produces responses approximately 33x larger than SimplePIR. Unimplemented optimizations (Appendix E.1) could reduce this to approximately 9x. ³⁶
Circular security assumption: Required for RLWE seed reuse across queries. Standard RLWE does not suffice. ³⁷
Small databases: For very small databases (8 MB), HintlessPIR is slower than both SimplePIR and trivial download. The overhead of NTTlessPIR preprocessing and homomorphic computation dominates.
TensorPIR not practical for typical sizes: TensorPIR has better asymptotic communication O(m^{1/3} + n) but requires larger RLWE parameters (n >= 2^13, q >= 2^150) due to depth-2 computation, making it only competitive for databases >= 1 TB. ³⁸

Comparison with Prior Work ⤴

Single-query total communication (from Tables 1-2, exact)

Metric	HintlessPIR	SimplePIR	Tiptoe PIR	Spiral	DoublePIR
Total comm (1st query)	3533 KB	185 MB + 262 KB	23 MB + ~2.7 MB	9 MB + 37 KB	874 MB + 750 KB
Server online time	613 ms	105 ms	1890 ms	2319 ms	113 ms
Server throughput	1750 MB/s	10268 MB/s	568 MB/s	463 MB/s	9506 MB/s
Client recovery	51.00 ms	--	14.67 ms	--	--
Offline comm required?	No	Yes (hint)	No	Yes (params)	Yes (hint)
DB params	2^30 x 1B (1.07 GB)	same	same	same	same

Key takeaway: HintlessPIR is the optimal choice when clients make few queries (1-50) against a database that may change, and when anonymous access is required. For high-throughput bulk serving with amortization over many queries, SimplePIR remains faster. For minimum per-query communication with amortization, Spiral is better.

Portable Optimizations ⤴

Composable preprocessing (alpha/beta decomposition): Applicable to any RLWE-based protocol that uses gadget products and key-switching. The paper notes this could be applied to the RLWE expansion algorithm of [15] and potentially to other protocols beyond PIR. ³⁹
NTT-precomputation for diamond-products: Reduces diamond-product online cost from O(n log n) to O(n) Z_q operations by precomputing NTT(g^{-1}(iNTT(a_hat))_i). Applicable to any scheme using gadget-based key-switching.
CRT decomposition for arbitrary-modulus LinPIR: Running LinPIR mod several NTT-friendly primes p_j and using CRT reconstruction lets the scheme handle moduli Q that are not NTT-friendly. Applicable to any RLWE-based scheme that needs to operate over non-NTT moduli.
RLWE slot packing for matrix-vector multiplication: Packing two copies of s in the 4096 RLWE slots reduces the number of rotations from n_cols - 1 to 511 and halves the ciphertext-plaintext multiplications. ⁴⁰

Implementation Notes ⤴

Language / Library: C++ (custom implementation, not built on SEAL or OpenFHE). Uses RNS variant of BFV for the NTTlessPIR component. ⁴¹
Polynomial arithmetic: NTT-based (all computation in NTT domain after preprocessing).
CRT decomposition: Q = 2^32 (LWE ciphertext modulus); RLWE modulus q = product of NTT-friendly primes totaling ~90 bits; plaintext moduli p_0, p_1 of 22 bits each.
SIMD / vectorization: AVX-512 (Intel Sapphire Rapids). ⁴²
Parallelism: Single-threaded for main benchmarks; 4-thread results in Table 3. NTTlessPIR's first step (rotations) parallelized to 2 threads (one per plaintext modulus p_j); second step (matrix-vector multiply) parallelized to 4 threads. ⁴³
Open source: https://github.com/google/hintless_pir [^44]

Open Problems ⤴

Extending composable preprocessing to the GHS variant of key-switching [25], which the authors note is nontrivial. ⁴⁴
Applying composable preprocessing to other protocols beyond PIR (e.g., FHE-based general computation).
Reducing the response size overhead from approximately 33x (current) toward the approximately 9x theoretical minimum identified in Appendix E.1.

Uncertainties ⤴

Variable "n" overloading: The paper uses n for the RLWE ring dimension (n = 2^12 = 4096) and N for the LWE secret dimension (N = 1408). These are consistent throughout but differ from some other papers (e.g., SimplePIR uses n for LWE dimension). The asymptotic comparison in Figure 1 uses n for LWE secret dimension and m for database size.
Gadget parameters: The exact gadget quality gamma and gadget size ell are not explicitly stated for the concrete parameter choices. From q approx 2^90, ell is inferred to be approximately 3.
Packing details: The paper mentions packing two diagonals of each 1024 x 2048 block of H but the exact slot layout and its interaction with CRT decomposition could benefit from more explicit specification.
Response size constant factor: The paper reports approximately 33x response overhead vs SimplePIR in the current implementation but projects approximately 9x with optimizations (Appendix E.1). Neither figure directly maps to the asymptotic O(1) constant in Figure 1.

Abstract (p.1): "Our first construction (HintlessPIR) eliminates the client preprocessing step from the recent LWE-based SimplePIR... by outsourcing the 'hint' related computation to the server, leveraging a new concept of homomorphic encryption with composable preprocessing." ↩
Section 1, Technical Contributions (p.9): "Note that while Tiptoe PIR is similarly 'hintless', its per-query communication is asymptotically worse (O(sqrt(m) + n^2) compared to O(sqrt(m) + n)), yielding concretely worse per-query bandwidth (by two orders of magnitude) than HintlessPIR, while also being slower by constant factors." ↩
Section 1, "HintlessPIR" (p.5): "The starting point for our first construction is the SimplePIR construction. This arranges the database as a square matrix... One can therefore encrypt the selection vector u_i, and homomorphically multiply by the database to obtain a PIR scheme." ↩
Section 3 (p.18): "The main idea is that RLWE-based ciphertexts consists of two parts (a, b) where a is some public randomness that does not depend on the encrypted message, and is often available in advance... the public randomness of the ciphertexts output by homomorphic operations often depends only on the public randomness of the input." ↩
Section 8, Conclusion (p.31): "we were able to achieve concretely fast server processing time in our first construction, HintlessPIR, namely up to 60% of the throughput of Simple PIR, and up to 9.4x higher throughput than Spiral PIR." ↩
Definition 5 (p.16): "A Single-Server LinPIR Scheme with Preprocessing is a tuple of four algorithms..." ↩
Definition 7 (p.16): "A Single-Server LinPIR scheme with Preprocessing is said to be secure if for all (i, j) in [m]^2, the distributions of query(i) and query(j) are computationally indistinguishable." ↩
Definition 6 (p.16): "A Single-Server LinPIR scheme with Preprocessing Scheme is said to be (1 - delta)-correct if for any database DB..." ↩
Lemma 6 (p.23): Complexity analysis of NTTlessPIR. ↩
Section 2.5, "Tiptoe PIR" (p.17): "The main difference between this LinPIR scheme and the scheme we develop in Section 4 (NTTlessPIR) is that Tiptoe PIR utilizes the column-major (rather than diagonally-dominant) matrix-vector multiplication algorithm." ↩
Section 3 (p.18): "Eval_F(ek, ct) = Apply_F(Preproc_F(alpha(ek), alpha(ct)), ek, ct)" and "alpha(Apply_F(g, ek, ct)) = Apply^alpha_F(g)." ↩
Section 3, Equation 3 (p.18): "the public randomness component of the output... depends only on the result of the preprocessing g, and in particular, it can be computed in advance." ↩
Lemmas 2, 3, 4 (pp.20-21): Preprocessing complexities for diamond-product, key-switching, and batched rotations respectively. ↩
Section 1, "Homomorphic Encryption with Composable Preprocessing" (p.7): "The high-level idea behind homomorphic encryption with composable preprocessing is similar to SimplePIR, albeit in the setting of RLWE-based encryption, and for a wider class of computations." ↩
Lemma 18, Appendix F (p.46): "Let (N, Q, sigma) be such that LWE is hard. Let (n, q, sigma) be such that RLWE is hard, and moreover assume that RLWE is circular secure. Then HintlessPIR is a secure PIR with preprocessing scheme." ↩
Definitions 1-2 (p.14): LWE and RLWE encryption definitions with seed-based public randomness. ↩
Section 7.1 (pp.26-27): "We set N = 1408, ciphertext modulus Q = 2^32... We set our RLWE parameters as n = 2^12, ciphertext modulus q approx 2^90, and error standard deviation 3.2... we choose two NTT-friendly plaintext moduli p_0, p_1 of 22 bits each for CRT decomposing H and s." ↩
Section 7.1 (pp.26-27): "sample the LWE secret key from the uniform ternary distribution." ↩
Lemma 19, Appendix F (p.46): Correctness conditions for HintlessPIR combining LWEPIR, NTTlessPIR decryption, and CRT interpolation failure probabilities. ↩
Section 7.2 (p.27): "For large database records, we follow the suggestion in [34] to encode each record using d > 1 LWE plaintext elements and vertically stack them in a column of the database matrix DB." ↩
Lemma 7 (p.24): "Server Long-term Storage: knN(ell + 1) elements of Z_q and sqrt(m) * N elements of Z_Q." ↩
Section 7.2 (p.27): "a NTTlessPIR query is 323KB (which includes two compressed ciphertexts and a compressed rotation key)." ↩
Section 7.2 (p.27): "The LWE plaintext space is about 8 to 10 bits for databases up to 2^38 records." ↩
Appendix A (p.37-38): "We will use the independence heuristic, or the heuristic assumption that intermediate values within homomorphic computations are independent, and therefore one may apply pythagorean additivity in all situations." ↩
Appendix A (p.38): "the noise bounds we derive will be heuristic -- we will validate them against our implementation in Section 7." ↩
Section 1, footnote 10 (p.9): "In our current implementation, this constant factor is somewhat large -- approx 33x larger. We discuss several optimizations that would reduce this to approx 9x larger in Appendix E.1." ↩
Section 7.1 (p.27): RLWE parameters n = 2^12, q approx 2^90, implying ell = ceil(90/30) = 3 with 30-bit gadget decomposition base. ↩
Section 7.2 (p.27): "We ran our server program on an AWS r7iz.4xlarge instance with Intel Sapphire Rapids CPUs running at 3.00GHz and with 128GB RAM. We took advantage of the SIMD instruction sets such as AVX-512, and compiled our test program using clang 16." ↩
Section 7.2 (p.28-29): "One of the advantages of HintlessPIR is the absence of offline interaction between the client and the server, which makes it very appealing for situations where the client only makes a few queries before the database is updated." ↩
Section 1, "The problem with preprocessing" (p.4): "if the server requires client-dependent state, the server must have knowledge of this information for correct protocol execution... does not provide anonymity regarding which client is querying the database at which time." ↩
Section 1, "The problem with preprocessing" (p.3): "When the database is updated, all hints need to be recomputed to maintain correctness across all clients." ↩
Table 2 (p.29): Server preprocessing time for HintlessPIR at 1.07 GB is 199.15 s vs SimplePIR's 188.03 s. ↩
Appendix E, after Lemma 11 (p.42): "we set kappa minimal such that the amortized cost of preprocessing disappears (asymptotically). For NTTlessPIR, one can check that this is kappa = omega(log n)." ↩
Section 1, "Implementation" (p.11): "We find that our protocol has lower bandwidth until one is able to reuse a hint for approx 50 to 100 SimplePIR queries to the same database, and our bandwidth advantage over Spiral holds for the first 3 to 5 queries." ↩
Section 8 (p.31): "up to 60% of the throughput of Simple PIR, and up to 9.4x higher throughput than Spiral PIR." ↩
Appendix E.1 (pp.43-45): Three optimizations are discussed: packing (implemented, gives constant-factor speedup), reducing response size by half (sending only beta component), and lossily compressing RLWE ciphertexts. ↩
Lemma 18, Appendix F (p.46): "moreover assume that RLWE is circular secure." ↩
Section 7.3 (p.31): "We estimate that for databases of size 2^40 with 1 byte records that TensorPIR has server running time close to HintlessPIR and with smaller communication cost." ↩
Section 8, Conclusion (p.32): "It seems very interesting to extend such technique to additional homomorphic operations and constructions." ↩
Section 7.1 (p.27): "Our RLWE parameters provide 4096 slots in each plaintext polynomial, so we pack two copies of s in the query ciphertexts... this packing strategy reduces the number of rotations to 511, and reduces the number of ciphertext-plaintext multiplications by half." ↩
Section 7.1 (p.27): "Our NTTlessPIR scheme implementation is based on the Residue Number System (RNS) variant of Brakerski/Fan-Vercauteren (BFV) FHE scheme that supports linear homomorphic operations." ↩
Section 7.2 (p.27): "We took advantage of the SIMD instruction sets such as AVX-512." ↩
Section 7.2 (p.30): "the first step of NTTlessPIR's online algorithm is distributed to two threads, one per plaintext modulus, and the second step to four threads." ↩
Section 8, Conclusion (p.32): "It seems nontrivial to apply this technique to the GHS variant of key-switching." ↩

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

HintlessPIR — Engineering Notes

Lineage ⤴

Core Idea ⤴

Variants ⤴

Novel Primitives / Abstractions ⤴

LinPIR (Linear PIR)

Composable Preprocessing (alpha/beta Decomposition)

Cryptographic Foundation ⤴

Key Data Structures ⤴

Database Encoding ⤴

Protocol Phases ⤴

Communication Breakdown ⤴

Correctness Analysis ⤴

Option A: FHE Noise Analysis

Complexity ⤴

Core metrics

Preprocessing metrics

FHE-specific metrics

Performance Benchmarks ⤴

Table 1: Communication costs (exact, from Table 1 p.28)

Table 2: Computation costs (exact, from Table 2 p.29)

Table 3: Multi-threaded results (exact, from Table 3 p.30)

Application Scenarios ⤴

Deployment Considerations ⤴

Key Tradeoffs & Limitations ⤴

Comparison with Prior Work ⤴

Single-query total communication (from Tables 1-2, exact)

Portable Optimizations ⤴

Implementation Notes ⤴

Open Problems ⤴

Related Papers in Collection ⤴

Uncertainties ⤴

FilesExpand file tree

HintlessPIR_2023_notes.md

Latest commit

History

HintlessPIR_2023_notes.md

File metadata and controls

HintlessPIR — Engineering Notes

Lineage ⤴

Core Idea ⤴

Variants ⤴

Novel Primitives / Abstractions ⤴

LinPIR (Linear PIR)

Composable Preprocessing (alpha/beta Decomposition)

Cryptographic Foundation ⤴

Key Data Structures ⤴

Database Encoding ⤴

Protocol Phases ⤴

Communication Breakdown ⤴

Correctness Analysis ⤴

Option A: FHE Noise Analysis

Complexity ⤴

Core metrics

Preprocessing metrics

FHE-specific metrics

Performance Benchmarks ⤴

Table 1: Communication costs (exact, from Table 1 p.28)

Table 2: Computation costs (exact, from Table 2 p.29)

Table 3: Multi-threaded results (exact, from Table 3 p.30)

Application Scenarios ⤴

Deployment Considerations ⤴

Key Tradeoffs & Limitations ⤴

Comparison with Prior Work ⤴

Single-query total communication (from Tables 1-2, exact)

Portable Optimizations ⤴

Implementation Notes ⤴

Open Problems ⤴

Related Papers in Collection ⤴

Uncertainties ⤴

Footnotes