More Scanned Pages

Per request, I have printed three pages in succession and scanned them in, with a large area of whitespace on the sheets. To save a bit of space, I cropped the images to remove areas that are not useful for reversing the watermark.

The images can be downloaded using this torrent.

After poking around at the images a bit, I realized I had to add more LEDs and clean the scanner glass to get good-fidelity images. Pages 14 and 15 were scanned using the improved system; I think you will see a marked improvement over page 13.


Above is a swatch of watermark from page 14. It comes from the printer with serial number CNBC55L0QJ, formatter number F9221M9. The image was processed in Photoshop with the following sequence: high-pass filter, radius = 6.6; flatten channels; convert to grayscale; contrast +60%; brightness +20%. Does anyone know if GIMP can automate tasks like this with a script?
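
For anyone without Photoshop, here is a rough pure-Python sketch of the same kind of filter chain, working on a grayscale image stored as a list of rows of 0-255 values. It is an approximation under stated assumptions, not a reproduction of Photoshop's math: the high-pass step subtracts a simple box blur rather than a Gaussian, the radius is rounded to an integer, and the contrast and brightness formulas are my own guesses at what "+60%" and "+20%" mean. The function names are made up for this example.

```python
def box_blur(img, radius):
    """Mean of the (2*radius+1)^2 neighbourhood, clipped at the image edges."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        ys = range(max(0, y - radius), min(h, y + radius + 1))
        for x in range(w):
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            total = sum(img[j][i] for j in ys for i in xs)
            out[y][x] = total / (len(ys) * len(xs))
    return out


def clamp(value):
    """Round to an integer and keep it inside the 0-255 range."""
    return max(0, min(255, int(round(value))))


def enhance(img, radius=7, contrast=0.60, brightness=0.20):
    """High-pass, then contrast stretch, then brightness offset (all assumed forms)."""
    blurred = box_blur(img, radius)
    result = []
    for row, blurred_row in zip(img, blurred):
        new_row = []
        for v, b in zip(row, blurred_row):
            hp = v - b + 128                        # keep only local detail, centred on mid-gray
            hp = (hp - 128) * (1 + contrast) + 128  # stretch contrast about mid-gray
            hp = hp + brightness * 255              # lift overall brightness
            new_row.append(clamp(hp))
        result.append(new_row)
    return result
```

Because the blur removes slow changes in paper tone, a faint dot that is only slightly darker than its surroundings ends up with a larger difference from the background after the contrast step.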

7 Responses to “More Scanned Pages”

  1. modrobert says:

    I’m not sure if this helps, but here is page 14 in ASCII binary format:

    01001000000 00000 01
    10000100000 00000 10
    01000000001 00000 01
    00000101000 10100 00
    01000000001 00000 01
    00000000010 00001 00
    00000000100 00000 00
    00000001000 10000 00
    00001010101 00000 00
    00000000010 00001 00
    01000000001 00000 01
    00000100000 10000 00
    00000000100 01000 00
    00010000010 10100 00

    00010000000 00000 00
    00110000000 00000 00
    00001000000 01000 00
    10000000010 00000 10
    00000010000 00010 00
    00000001000 00001 00

    00000100000 00001 00
    01001000000 00000 01
    10000100000 00000 10
    01000000001 00000 01

    If the tt html tag fails in this comment, load the text file directly here:

    http://tlords.kicks-ass.net/watermark/page14_bitwise.txt

    This was simply done in gimp by manually making a grid over the dots on one image and using a color level adjusted image as reference.

    http://tlords.kicks-ass.net/watermark/watermark_to_ascii_binary_method.jpg

    I left the “gaps” as spaces, but maybe they should be represented as lines of zeroes instead. Regardless of whether you read this horizontally or vertically, it looks like the last few bits could be some kind of checksum.
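
    One way to poke at the checksum idea is to load the first block of the transcription above and tabulate the number of set bits per row and per column, plus the parity of the first 16 bits next to the trailing 2-bit group. This is only an exploratory sketch: it assumes the transcription is accurate, it drops the gap columns, and nothing here claims that parity is the actual encoding. The helper names are made up for this example.

```python
PAGE14_BLOCK1 = """\
01001000000 00000 01
10000100000 00000 10
01000000001 00000 01
00000101000 10100 00
01000000001 00000 01
00000000010 00001 00
00000000100 00000 00
00000001000 10000 00
00001010101 00000 00
00000000010 00001 00
01000000001 00000 01
00000100000 10000 00
00000000100 01000 00
00010000010 10100 00
"""


def to_matrix(text):
    """Turn the transcription into rows of 0/1 ints, dropping the gap columns."""
    return [[int(c) for c in line if c in "01"]
            for line in text.splitlines() if line.strip()]


def row_weights(matrix):
    """Number of set bits in each row."""
    return [sum(row) for row in matrix]


def col_weights(matrix):
    """Number of set bits in each column."""
    return [sum(col) for col in zip(*matrix)]


def parity_table(matrix, data_bits=16):
    """Pair the parity of the leading bits with the trailing bits of each row."""
    return [(sum(row[:data_bits]) % 2, row[data_bits:]) for row in matrix]


if __name__ == "__main__":
    m = to_matrix(PAGE14_BLOCK1)
    print("row weights:", row_weights(m))
    print("col weights:", col_weights(m))
    for parity, tail in parity_table(m):
        print(parity, tail)
```

    Swapping in the other blocks, or a corrected transcription, only requires changing the string at the top.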

  2. Nickk says:

    Yes, it can be automated in GIMP; IIRC it has a Scheme-based scripting language called Script-Fu.

    I think it’s awesome that you’re doing this.

    Couldn’t we completely jam this watermarking method by manipulating values in the image before it’s communicated to the printer? If we can add spurious yellow marks at the driver level it might be the easiest method to defeat this thing.

  3. DC says:

    modrobert – from my look @ the pattern, you’ve got two extra columns and 3 extra lines ;)

    Oh, and I used the same general idea, but gimp’s make grid is fairly handy for generating the 36×36 pixel placement grid.

    In my grid I see the same blank rows, but the last blank column has a bit in it. The bit is pretty dubious as it’s really faint, but I’m gonna look at these 3 pages to see if I can find any more correlations.

    Tonight or tomorrow I’m gonna try and write a program based on imagemagick to analyze these files and autolocate the bits. Gimp is pretty slow when dealing with the full files.

    Bunnie – when you get those EEPROM sockets I would be interested in seeing traces to and from the backend EEPROM on powerup / on page print. I suspect the page-print activity you’re seeing is the printer updating a page count inside the EEPROM, which would explain the differences between pages. An FPGA-based sniffer perhaps?

    I also volunteer to write the Verilog for that, assuming you don’t already have something ready to go.

    Anyways, I think I see a pattern in that last column of 5, but my brain is too tired to get it.

    Cheers,

    -DC
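
    The autolocate step described above could be sketched roughly like this: lay a grid of known pitch over a grayscale image and sample a small window at the centre of each cell, so that dots are read between the grid lines rather than behind them. This is a pure-Python illustration, not the ImageMagick program mentioned in the comment. The grid origin, the threshold and the window size are all assumptions that would need tuning against a real scan, and the function name is made up for this example.

```python
def read_grid(img, pitch, origin=(0, 0), threshold=100, window=1):
    """Return a matrix of 0/1 bits, one per grid cell.

    img       -- grayscale image as a list of rows of 0-255 values
    pitch     -- cell size in pixels (e.g. 36 for a 36x36 grid)
    origin    -- (x, y) pixel of the top-left corner of the grid (assumed known)
    threshold -- a cell counts as a dot if its centre window has a pixel darker than this
    window    -- half-width in pixels of the window sampled around each cell centre
    """
    ox, oy = origin
    h, w = len(img), len(img[0])
    rows = (h - oy) // pitch
    cols = (w - ox) // pitch
    bits = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Sample at the centre of the cell, away from the grid lines.
            cy = oy + r * pitch + pitch // 2
            cx = ox + c * pitch + pitch // 2
            patch = [img[y][x]
                     for y in range(max(0, cy - window), min(h, cy + window + 1))
                     for x in range(max(0, cx - window), min(w, cx + window + 1))]
            row.append(1 if min(patch) < threshold else 0)
        bits.append(row)
    return bits
```

    Faint dots like the dubious one in the last column are exactly where the threshold choice matters, so it is worth printing the raw minimum per cell before trusting the 0/1 output.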

  4. DC says:

    oh, and modrobert, there is a transcription error in your copy.. those dots hide right behind the lines ;). My solution was to put the dots in the center between the lines..

    I’m too tired now to be sure that my transcription is accurate, but I’m sure there’s one that’s wrong in yours.

    The first 00000000100 00000 should be 00000000100 00010.

    One other thing: this one’s placed at ~26×26, rather than the previous 36×36… is it just scaling? Or was it scanned at a different resolution? Or did it print differently?

    Later,

    -DC

  5. modrobert says:

    DC,

    I considered that dot in 00000000100 000[?]0 00 a bit too vague to qualify. It looks sharp in the level-adjusted image, but check the original page 14 JPEG heading this story.

    http://bunniestudios.com/blog/images/watermark_swatch.jpg

    The data could be read vertically, horizontally, mirrored or as is, so I pasted together this image where you can see the bits from different views.

    http://tlords.kicks-ass.net/watermark/page14_bitwise_different_views.gif

    This might be useful when searching for a bit pattern to match serial number etc.
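
    The same views can be generated in code, which makes it easy to search all of them for a candidate bit string at once. A small sketch, assuming the bits are held as a list of rows of 0/1 ints; the view names and helper functions are made up for this example, and only the first occurrence in each row is reported.

```python
def views(matrix):
    """Return the bit matrix as-is plus mirrored, flipped and transposed copies."""
    transposed = [list(col) for col in zip(*matrix)]
    return {
        "as_is": [row[:] for row in matrix],
        "mirrored": [row[::-1] for row in matrix],
        "upside_down": [row[:] for row in matrix[::-1]],
        "transposed": transposed,
        "transposed_mirrored": [row[::-1] for row in transposed],
    }


def as_text(matrix):
    """Render a bit matrix as lines of '0' and '1' characters."""
    return "\n".join("".join(str(bit) for bit in row) for row in matrix)


def find_pattern(matrix, pattern):
    """List (view name, row index, offset) for the first match of pattern in each row."""
    hits = []
    for name, view in views(matrix).items():
        for index, line in enumerate(as_text(view).splitlines()):
            offset = line.find(pattern)
            if offset != -1:
                hits.append((name, index, offset))
    return hits
```

    For example, `find_pattern(bits, "10100")` reports every view and row where that run of bits appears, which is a quick first pass before trying to line anything up with the serial number.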

  6. modrobert says:

    I found these interesting PDFs from HP…

    “A Transform Domain Hardcopy Watermarking Scheme”
    http://www.hpl.hp.com/techreports/2001/HPL-2001-309.pdf

    “Watermarking of Dither Halftoned Images”
    http://www.hpl.hp.com/techreports/98/HPL-98-32.pdf

    This is a research PDF titled “Watermarking Printed Images” from Purdue University, supported by the Hewlett-Packard Company.

    http://www.ima.umn.edu/talks/workshops/2-12-16.2001/kacker/kacker.pdf

    If the links above fail you can find these documents mirrored here:

    http://tlords.kicks-ass.net/watermark/

  7. plastic says:

    Oh,what a beautiful blog! I like it very much! I’m agreeable to your point of view!