Skip to content

Conversation

@eitanporat
Copy link
Contributor

@eitanporat eitanporat commented Oct 21, 2025

Background

Nonogram is a logic puzzle. You are given a grid you need to color in black and white and "hints" - which indicate the run lengths.

For example, a clue of "4 8 3" would mean there are sets of four, eight, and three filled squares, in that order, with at least one blank square between successive sets.
image

This game is hard mathematically (it is "NP-HARD" meaning that we don't expect to find an algorithm that solves it completely but we have good heuristics)

The wikipedia article has a lot of nice techniques.

Environment

There are two types of actions coloring white and coloring black. I also experimented with allow toggle on and off, but I found the agent getting stuck in loops and I wanted to force it to see new boards).

I implemented two modes of the environment

One I call easy_learn (in practice it is hard to play for humans) in which a wrong move leads to an instant death. There are some ambiguity for randomly generated boards but I think the signal is strong regardless.

The other one has a couple of heuristics to check if the board is correct / incorrect

  • if a play places a black square in a row without hints it's incorrect
  • if a row has 10 black squares and the total of clues is 10 at the player colors another black it's incorrect
  • If a player completes a row (as in the total equals the total of the clues) and the pattern doesn't match it's incorrect
  • If for example a clue is "4 5 8" and color a black square creates 9 consecutive black squares it's incorrect.

This mode is harder to learn in practice.

I also give negative rewards for invalid moves and positive rewards if the player completes a row or a column or completes the board (a larger reward for completing the board).

Neural network

I tried many different kinds of architectures. In the end, I decided on encoding the board using a CNN the clues using a dense architecture separately (rows and columns) and also encoding the board size.

The grid has 4 types of cells (padding, empty, black, white) and clues have theoretically 0-8 (so 9 types).

In practice I statically put the maximum board size at 8.

Curriculum Learning

I do two types of randomization

  1. random board size
  2. I sample each black or white based on a probability p (sampled uniformly from 0 to 1)

I ran my best policy against different board sizes
It can beat all sizes up until 5x5, and 70% of 6x6 boards.

image

eitanporat and others added 6 commits October 21, 2025 18:50
- Add Nonogram C implementation (nonogram.c, nonogram.h, binding.c)
- Add Python wrapper (nonogram.py)
- Add Tetris-style neural network architecture for Nonogram in torch.py
  - Multi-layer CNN for grid with spatial downsampling
  - Separate encoders for row clues, column clues, and board size
  - No weight sharing between row and column encoders
- Add configuration file (nonogram.ini)
- Register NonogramLSTM and Nonogram policy classes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant