Thursday, March 31, 2016

http://stackoverflow.com/questions/9480013/image-processing-to-improve-tesseract-ocr-accuracy

  1. fix the DPI if needed (300 DPI is the minimum)
  2. fix the text size (e.g. 12 pt should be OK)
  3. try to fix the text lines (deskew and dewarp the text)
  4. try to fix the illumination of the image (e.g. no dark parts in the image)
  5. binarize and de-noise the image
There is no universal command line that fits all cases (sometimes you need to blur and sharpen the image). But you can give TEXTCLEANER from Fred's ImageMagick Scripts a try.
If you are not a fan of the command line, you can try the open-source scantailor.sourceforge.net or the commercial bookrestorer.
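
For step 5, here is a minimal Java sketch of binarization with a fixed threshold. It is only an illustration: the class name, the method name and the threshold value are my own assumptions, not something from the answer, and real scans usually need an adaptive threshold and de-noising on top of this.

```java
import java.awt.image.BufferedImage;

final class Binarize {
    // Turns every pixel into pure black or pure white.
    // threshold is a luminance value from 0 to 255 (an assumed, hand-picked parameter).
    static BufferedImage binarize(BufferedImage src, int threshold) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_BINARY);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Integer approximation of perceived brightness
                int lum = (299 * r + 587 * g + 114 * b) / 1000;
                out.setRGB(x, y, lum >= threshold ? 0xFFFFFFFF : 0xFF000000);
            }
        }
        return out;
    }
}
```

A threshold around 128 is a reasonable first guess for dark text on a light background, but it has to be tuned per image.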



I am by no means an OCR expert, but this week I needed to extract text from a JPG.
I started with a color RGB JPG of 445x747 pixels. I immediately tried tesseract on it, and the program converted almost nothing. I then went into GIMP and did the following:
  1. image > mode > grayscale
  2. image > scale image > 1191x2000 pixels
  3. filters > enhance > unsharp mask, with radius = 6.8, amount = 2.69, threshold = 0
I then saved the result as a new JPG at 100% quality.
Tesseract was then able to extract all the text into a .txt file.
GIMP is your friend.
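
The first two GIMP steps (grayscale and upscaling) can be roughly approximated in plain Java. This is only a sketch under my own assumptions: the class and method names are made up, bicubic interpolation is my choice, and the unsharp mask step is not reproduced here.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

final class OcrPrep {
    // Draws the source into a grayscale image of the requested size.
    static BufferedImage grayscaleAndScale(BufferedImage src, int targetWidth, int targetHeight) {
        BufferedImage out = new BufferedImage(
                targetWidth, targetHeight, BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = out.createGraphics();
        // Bicubic interpolation tends to give smoother enlarged text than the default
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BICUBIC);
        g.drawImage(src, 0, 0, targetWidth, targetHeight, null);
        g.dispose();
        return out;
    }
}
```

For the image described above, the call would be grayscaleAndScale(image, 1191, 2000), and the result can be written with ImageIO.write before running tesseract on it.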

Sunday, March 27, 2016

how to make a range of colors in image transparent java

http://stackoverflow.com/questions/665406/how-to-make-a-color-transparent-in-a-bufferedimage-and-save-as-png

import java.awt.*;
import java.awt.image.BufferedImage;
import java.awt.image.FilteredImageSource;
import java.awt.image.ImageFilter;
import java.awt.image.ImageProducer;
import java.awt.image.RGBImageFilter;
import java.io.*;

import javax.imageio.ImageIO;

public class AddTransparency
{
  AddTransparency() throws IOException
  {
    String imagePath = "E:/Documents/images/";
    File inFile = new File(imagePath, "map.png");
    BufferedImage image = ImageIO.read(inFile);

    Image transpImg1 = TransformGrayToTransparency(image);
    BufferedImage resultImage1 = ImageToBufferedImage(transpImg1, image.getWidth(), image.getHeight());

    File outFile1 = new File(imagePath, "map_with_transparency1.png");
    ImageIO.write(resultImage1, "PNG", outFile1);

    Image transpImg2 = TransformColorToTransparency(image, new Color(0, 50, 77), new Color(200, 200, 255));
    BufferedImage resultImage2 = ImageToBufferedImage(transpImg2, image.getWidth(), image.getHeight());

    File outFile2 = new File(imagePath, "map_with_transparency2.png");
    ImageIO.write(resultImage2, "PNG", outFile2);
  }

  private Image TransformGrayToTransparency(BufferedImage image)
  {
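    ImageFilter filter = new RGBImageFilter()
    {
      public final int filterRGB(int x, int y, int rgb)
      {
        // Move the red channel into the alpha byte and clear the color (black):
        // for a gray pixel (r = g = b), brighter means more opaque
        return (rgb << 8) & 0xFF000000;
      }
    };

    ImageProducer ip = new FilteredImageSource(image.getSource(), filter);
    return Toolkit.getDefaultToolkit().createImage(ip);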
  }

  private Image TransformColorToTransparency(BufferedImage image, Color c1, Color c2)
  {
    // Primitive test, just an example
    final int r1 = c1.getRed();
    final int g1 = c1.getGreen();
    final int b1 = c1.getBlue();
    final int r2 = c2.getRed();
    final int g2 = c2.getGreen();
    final int b2 = c2.getBlue();
    ImageFilter filter = new RGBImageFilter()
    {
      public final int filterRGB(int x, int y, int rgb)
      {
        int r = (rgb & 0xFF0000) >> 16;
        int g = (rgb & 0xFF00) >> 8;
        int b = rgb & 0xFF;
        if (r >= r1 && r <= r2 &&
            g >= g1 && g <= g2 &&
            b >= b1 && b <= b2)
        {
          // Set fully transparent but keep color
          return rgb & 0xFFFFFF;
        }
        return rgb;
      }
    };

    ImageProducer ip = new FilteredImageSource(image.getSource(), filter);
    return Toolkit.getDefaultToolkit().createImage(ip);
  }

  private BufferedImage ImageToBufferedImage(Image image, int width, int height)
  {
    BufferedImage dest = new BufferedImage(
        width, height, BufferedImage.TYPE_INT_ARGB);
    Graphics2D g2 = dest.createGraphics();
    g2.drawImage(image, 0, 0, null);
    g2.dispose();
    return dest;
  }

  public static void main(String[] args) throws IOException
  {
    AddTransparency at = new AddTransparency();
  }
}
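
For comparison, the same color-range test can also be written as a plain pixel loop, without ImageFilter and Toolkit. This is my own sketch, not part of the Stack Overflow answer; the class and method names are hypothetical.

```java
import java.awt.Color;
import java.awt.image.BufferedImage;

final class RangeTransparency {
    // Returns an ARGB copy of src in which every pixel whose red, green and blue
    // values all lie between lo and hi (inclusive) is made fully transparent.
    static BufferedImage makeRangeTransparent(BufferedImage src, Color lo, Color hi) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_INT_ARGB);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int argb = src.getRGB(x, y);
                int r = (argb >> 16) & 0xFF;
                int g = (argb >> 8) & 0xFF;
                int b = argb & 0xFF;
                boolean inRange = r >= lo.getRed() && r <= hi.getRed()
                        && g >= lo.getGreen() && g <= hi.getGreen()
                        && b >= lo.getBlue() && b <= hi.getBlue();
                // Clearing the alpha byte makes the pixel fully transparent
                out.setRGB(x, y, inRange ? (argb & 0x00FFFFFF) : argb);
            }
        }
        return out;
    }
}
```

The result has to be saved as PNG (for example with ImageIO.write(result, "PNG", file)), because JPEG has no alpha channel.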

Saturday, March 26, 2016

select image to insert tinymce

tinyMCE.init({
            // General options
            mode: "textareas",
            theme: "advanced",
            plugins: "autolink,lists,pagebreak,style,layer,table,save,advhr,advimage,advlink,emotions,iespell,inlinepopups,insertdatetime,preview,media,searchreplace,print,contextmenu,paste,directionality,fullscreen,noneditable,visualchars,nonbreaking,xhtmlxtras,template,wordcount,advlist,autosave,visualblocks",
            // Theme options
            theme_advanced_buttons1: "|,bold,italic,underline,strikethrough,|,justifyleft,justifycenter,justifyright,justifyfull,styleselect,formatselect,fontselect,fontsizeselect",
            theme_advanced_buttons2: "cut,copy,paste,pastetext,pasteword,|,search,replace,|,bullist,numlist,|,outdent,indent,blockquote,|,undo,redo,|,link,unlink,anchor,image,cleanup,help,code,|,insertdate,inserttime,preview,|,forecolor,backcolor",
            theme_advanced_buttons3: "tablecontrols,|,hr,removeformat,visualaid,|,sub,sup,|,charmap,emotions,iespell,media,advhr,|,print,|,ltr,rtl,|,fullscreen",
            theme_advanced_buttons4: "insertlayer,moveforward,movebackward,absolute,|,styleprops,|,cite,abbr,acronym,del,ins,attribs,|,visualchars,nonbreaking,template,pagebreak,restoredraft,visualblocks",
            theme_advanced_toolbar_location: "top",
            theme_advanced_toolbar_align: "left",
            theme_advanced_statusbar_location: "bottom",
            theme_advanced_resizing: true,
            // Example content CSS (should be your site CSS)
            content_css: "css/content.css",
            // Drop lists for link/image/media/template dialogs
            template_external_list_url: "lists/template_list.js",
            external_link_list_url: "lists/link_list.js",
            external_image_list_url: "/ajax/am/get-list-image.js",
            media_external_list_url: "lists/media_list.js",
            // Style formats
            style_formats: [
                {title: 'Bold text', inline: 'b'},
                {title: 'Red text', inline: 'span', styles: {color: '#ff0000'}},
                {title: 'Red header', block: 'h1', styles: {color: '#ff0000'}},
                {title: 'Example 1', inline: 'span', classes: 'example1'},
                {title: 'Example 2', inline: 'span', classes: 'example2'},
                {title: 'Table styles'},
                {title: 'Table row 1', selector: 'tr', classes: 'tablerow1'}
            ],
            // document_base_url: "/static/image/product/",
            convert_urls: false,
            height: 450,
            width: 730
        });

/ajax/am/get-list-image.js

var tinyMCEImageList = new Array(
["/static/image/product/338.jpg", "/static/image/product/338.jpg"],
["/static/image/product/339.jpg", "/static/image/product/339.jpg"],
["/static/image/product/thumb_645.jpg", "/static/image/product/thumb_645.jpg"],
["thumb_646.jpg", "/static/image/product/thumb/thumb_646.jpg"]);




get-list-image.php (PHP script)

<?php

// You can't simply echo everything right away because we need to set some headers first!
$output = ''; // Here we buffer the JavaScript code we want to send to the browser.
$delimiter = "\n"; // for eye candy... code gets new lines
$output .= 'var tinyMCEImageList = new Array(';

$directory = APPLICATION_PATH . "/static/image/product"; // Use your correct (relative!) path here

function dirToArray($dir) {

    $result = array();

    $cdir = scandir($dir);
    foreach ($cdir as $key => $value) {
        if (!in_array($value, array(".", ".."))) {
            if (is_dir($dir . DIRECTORY_SEPARATOR . $value)) {
                $result[$value] = dirToArray($dir . DIRECTORY_SEPARATOR . $value);
            } else {
                $result[] = $value;
            }
        }
    }

    return $result;
}

$list_image = (dirToArray($directory));
//foreach ($list_image as $image) {
////    echo $image."<br/>";
//    $output .= $delimiter
//            . '["'
//            . $image
//            . '", "'
//            . '\static\image\product\\thumb' . $image
//            . '"],';
//}
$total = count($list_image);
for ($i = 0; $i < $total - 1; $i++) {
    $output .= $delimiter
            . '["'
            . '/static/image/product/'. $list_image[$i]
            . '", "'
            . '/static/image/product/'. $list_image[$i]
            . '"],';
}
$output .= $delimiter
        . '["'
        . $list_image[$total - 1]
        . '", "'
        . '/static/image/product/thumb/' . $list_image[$total - 1]
        . '"]';
// Finish code: end of array definition. Now we have the JavaScript code ready!
$output .= ');';

// Make output a real JavaScript file!
header('Content-type: text/javascript'); // browser will now recognize the file as a valid JS file
// prevent browser from caching
header('pragma: no-cache');
header('expires: 0'); // i.e. contents have already expired
// Now we can send data to the browser because all headers have been set!
echo $output;






Wednesday, March 23, 2016

Latency Numbers Every Programmer Should Know

https://gist.github.com/jboner/2841832#file-latency-txt

Latency Comparison Numbers
--------------------------
L1 cache reference                           0.5 ns
Branch mispredict                            5   ns
L2 cache reference                           7   ns                      14x L1 cache
Mutex lock/unlock                           25   ns
Main memory reference                      100   ns                      20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy             3,000   ns        3 us
Send 1K bytes over 1 Gbps network       10,000   ns       10 us
Read 4K randomly from SSD*             150,000   ns      150 us          ~1GB/sec SSD
Read 1 MB sequentially from memory     250,000   ns      250 us
Round trip within same datacenter      500,000   ns      500 us
Read 1 MB sequentially from SSD*     1,000,000   ns    1,000 us    1 ms  ~1GB/sec SSD, 4X memory
Disk seek                           10,000,000   ns   10,000 us   10 ms  20x datacenter roundtrip
Read 1 MB sequentially from disk    20,000,000   ns   20,000 us   20 ms  80x memory, 20X SSD
Send packet CA->Netherlands->CA    150,000,000   ns  150,000 us  150 ms
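
As a quick sanity check on the table, here is a small Java sketch that scales the three "Read 1 MB sequentially" rows up to a larger read. The class, constant and method names are my own, and the estimate is deliberately naive: it assumes throughput stays constant and ignores seek time.

```java
final class LatencyMath {
    // Nanoseconds to read 1 MB sequentially, taken from the table above
    static final double MEMORY_NS_PER_MB = 250_000;
    static final double SSD_NS_PER_MB = 1_000_000;
    static final double DISK_NS_PER_MB = 20_000_000;

    // Estimated seconds to read the given number of megabytes sequentially
    static double secondsToRead(double megabytes, double nsPerMb) {
        return megabytes * nsPerMb / 1e9;
    }
}
```

With these numbers, reading 1,000 MB sequentially comes out at about 0.25 s from memory, 1 s from SSD and 20 s from disk.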

Friday, March 4, 2016

Java code for WGS84 to Google map position and back

java 

static void fromWebMercatorToGeographic(double mercatorX_lon, double mercatorY_lat) {
    // Values this small are assumed to be in degrees already, so nothing is converted
    if (Math.abs(mercatorX_lon) < 180 && Math.abs(mercatorY_lat) < 90) {
        return;
    }

    // Outside the valid Web Mercator extent
    if ((Math.abs(mercatorX_lon) > 20037508.3427892) || (Math.abs(mercatorY_lat) > 20037508.3427892)) {
        return;
    }

    double x = mercatorX_lon;
    double y = mercatorY_lat;
    double num3 = x / 6378137.0;                 // 6378137 m is the WGS84 equatorial radius
    double num4 = num3 * 57.295779513082323;     // radians to degrees
    double num5 = Math.floor((num4 + 180.0) / 360.0);
    double num6 = num4 - (num5 * 360.0);         // wrap the longitude into [-180, 180)
    double num7 = 1.5707963267948966 - (2.0 * Math.atan(Math.exp((-1.0 * y) / 6378137.0)));
    mercatorX_lon = num6;
    mercatorY_lat = num7 * 57.295779513082323;
    // Java passes doubles by value, so the results are printed instead of returned
    System.out.println("mercatorY_lat=" + mercatorY_lat);
    System.out.println("mercatorX_lon=" + mercatorX_lon);
}

I ported this to PHP. Here's the code, in case anyone needs it:
To mercator:
$lon = ($lon * 20037508.34) / 180;
$lat = log(tan((90 + $lat) * M_PI / 360)) / (M_PI / 180);
$lat = $lat * 20037508.34 / 180;
From mercator:
$lon = ($lon / 20037508.34) * 180;
$lat = ($lat / 20037508.34) * 180;
$lat = 180/M_PI * (2 * atan(exp($lat * M_PI / 180)) - M_PI / 2);

Here are the functions in JavaScript, as extracted from OpenLayers:
function toMercator (lon, lat) {
  var x = lon * 20037508.34 / 180;
  var y = Math.log(Math.tan((90 + lat) * Math.PI / 360)) / (Math.PI / 180);
  y = y * 20037508.34 / 180;

  return [x, y];
}

function inverseMercator (x, y) {
  var lon = (x / 20037508.34) * 180;
  var lat = (y / 20037508.34) * 180;

  lat = 180/Math.PI * (2 * Math.atan(Math.exp(lat * Math.PI / 180)) - Math.PI / 2);

  return [lon, lat];
}
These are fairly straightforward to convert to Java.
http://stackoverflow.com/questions/7661/java-code-for-wgs84-to-google-map-position-and-back
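
A direct Java port of the two JavaScript functions could look like the sketch below. The class name and the constant name are my own, and returning a double[] pair is just one possible design, since Java has no out parameters.

```java
final class WebMercator {
    // Half the circumference of the Web Mercator sphere in meters (same constant as above)
    private static final double HALF_CIRCUMFERENCE = 20037508.34;

    // Converts longitude/latitude in degrees to Web Mercator x/y in meters
    static double[] toMercator(double lon, double lat) {
        double x = lon * HALF_CIRCUMFERENCE / 180;
        double y = Math.log(Math.tan((90 + lat) * Math.PI / 360)) / (Math.PI / 180);
        y = y * HALF_CIRCUMFERENCE / 180;
        return new double[] {x, y};
    }

    // Converts Web Mercator x/y in meters back to longitude/latitude in degrees
    static double[] inverseMercator(double x, double y) {
        double lon = (x / HALF_CIRCUMFERENCE) * 180;
        double lat = (y / HALF_CIRCUMFERENCE) * 180;
        lat = 180 / Math.PI * (2 * Math.atan(Math.exp(lat * Math.PI / 180)) - Math.PI / 2);
        return new double[] {lon, lat};
    }
}
```

Both methods return {x, y} or {lon, lat} in that order, so WebMercator.inverseMercator(m[0], m[1]) undoes double[] m = WebMercator.toMercator(lon, lat) up to floating-point error. Latitudes of exactly +90 or -90 degrees are not usable with this projection.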

private void FromWebMercatorToGeographic(ref double mercatorX_lon, ref double mercatorY_lat)
{
    if (Math.Abs(mercatorX_lon) < 180 && Math.Abs(mercatorY_lat) < 90)
        return;

    if ((Math.Abs(mercatorX_lon) > 20037508.3427892) || (Math.Abs(mercatorY_lat) > 20037508.3427892))
        return;

    double x = mercatorX_lon;
    double y = mercatorY_lat;
    double num3 = x / 6378137.0;
    double num4 = num3 * 57.295779513082323;
    double num5 = Math.Floor((double)((num4 + 180.0) / 360.0));
    double num6 = num4 - (num5 * 360.0);
    double num7 = 1.5707963267948966 - (2.0 * Math.Atan(Math.Exp((-1.0 * y) / 6378137.0)));
    mercatorX_lon = num6;
    mercatorY_lat = num7 * 57.295779513082323;
}
http://stackoverflow.com/questions/11957538/converting-geographic-wgs-84-to-web-mercator-102100