<?xml version="1.0" encoding="UTF-8"?> <rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" ><channel><title>Saeid Zebardast&#039;s Blog &#187; Java</title> <atom:link href="http://zebardast.ir/en/category/java/feed/" rel="self" type="application/rss+xml" /><link>http://zebardast.ir/en</link> <description></description> <lastBuildDate>Sun, 23 Oct 2011 18:18:07 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <generator>http://wordpress.org/?v=</generator><image><title>Saeid Zebardast&#039;s Blog</title> <url>http://0.gravatar.com/avatar/1518e6b905d65cbe0a03243a199e18fc.png?s=48</url><link>http://zebardast.ir/en</link> </image> <item><title>How to get pure content from HTML page in Java via Regex</title><link>http://zebardast.ir/en/how-to-get-pure-content-from-html-page-in-java-via-regex/</link> <comments>http://zebardast.ir/en/how-to-get-pure-content-from-html-page-in-java-via-regex/#comments</comments> <pubDate>Wed, 19 Jan 2011 14:19:00 +0000</pubDate> <dc:creator>Saeid Zebardast</dc:creator> <category><![CDATA[howto]]></category> <category><![CDATA[Java]]></category> <category><![CDATA[html]]></category> <category><![CDATA[Regex]]></category><guid isPermaLink="false">http://zebardast.ir/en/?p=177</guid> <description><![CDATA[Introduction I&#8217;ve written a web crawler while I was developing a search engine a few weeks ago. It extracts the contents and saves them onto the database. The HTML tags aren&#8217;t so important to most of the search engines. So, I removed them successfully. To do the same, follow below steps: 1- Remove the script [...] 
]]></description> <content:encoded><![CDATA[<p><strong>Introduction</strong><br /> A few weeks ago, while developing a search engine, I wrote a web crawler. It extracts the content of each page and saves it to the database. HTML tags matter little to most search engines, so I strip them out. To do the same, follow the steps below:<br /> 1- Remove the script tags and everything inside them:</p><pre class="brush: java; title: ; notranslate">
// Requires: import java.util.regex.Pattern;
// htmlContent holds the full HTML source of the page.

String content;
Pattern pattern;

pattern = Pattern.compile(&quot;&lt;script.*?&gt;.*?&lt;/script&gt;&quot;, Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
content = pattern.matcher(htmlContent).replaceAll(&quot;&quot;);
</pre><p><strong>Note:</strong> In dotall mode, the expression <code>.</code> matches any character, including a line terminator. By default this expression does not match line terminators.</p><p>2- Remove the style tags and everything inside them:</p><pre class="brush: java; title: ; notranslate">
// content and pattern are already declared in step 1.

pattern = Pattern.compile(&quot;&lt;style.*?&gt;.*?&lt;/style&gt;&quot;, Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
content = pattern.matcher(content).replaceAll(&quot;&quot;);
</pre><p>3- Remove all remaining HTML tags, keeping the text between them:</p><pre class="brush: java; title: ; notranslate">
pattern = Pattern.compile(&quot;&lt;[^&gt;]*&gt;&quot;);
content = pattern.matcher(content).replaceAll(&quot;&quot;);
</pre><p>4- Replace newlines, tabs and runs of spaces with a single space:</p><pre class="brush: java; title: ; notranslate">
content = content.replaceAll(&quot;\n+&quot;, &quot; &quot;);
content = content.replaceAll(&quot;\t+&quot;, &quot; &quot;);
content = content.replaceAll(&quot; {2,}&quot;, &quot; &quot;);
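// Possible shortcut (an assumption, not part of the original steps): \\s+ matches any run of
// whitespace, carriage returns included, so this single call could stand in for the three above.
content = content.replaceAll(&quot;\\s+&quot;, &quot; &quot;).trim();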
</pre><p>And now you have the pure text content <img src='http://zebardast.ir/en/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /></p><p><strong>Links</strong><br /> <a href="http://en.wikipedia.org/wiki/Regular_expression">Regular expression</a><br /> <a href="http://www.infernodevelopment.com/how-write-html-parser-java">How to Write an HTML Parser in Java</a><br /> <a href="http://www.regular-expressions.info/" title="Regex Tutorial, Examples and Reference">Regular-Expressions.info</a></p><div class="wp-biographia-container-top" style="background-color:#FFEAA8;"><div class="wp-biographia-pic"><img alt='' src='http://1.gravatar.com/avatar/1518e6b905d65cbe0a03243a199e18fc?s=100&amp;d=http%3A%2F%2F1.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D100&amp;r=G' class='avatar avatar-100 photo' height='100' width='100' /></div><div class="wp-biographia-text"><h3>About <a href="http://zebardast.ir/en/author/admin/" title="Saeid Zebardast">Saeid Zebardast</a></h3><p>I'm a senior software engineer with 5+ years of professional experience, including cross-platform proficiency and considerable knowledge of programming languages (especially Java), programming paradigms such as OO, and development methodologies. I have also been a MySQL DBA since 2006.</p><small><a href="mailto:s&#97;&#101;&#105;d.ze&#98;&#97;&#114;&#100;as&#116;&#64;&#103;&#109;ai&#108;&#46;&#99;&#111;m" title="Send Saeid Zebardast Mail">Mail</a> | <a href="http://zebardast.ir/" title="Saeid Zebardast On The Web">Web</a> | <a href="https://twitter.com/#!/saeid" title="Saeid Zebardast On Twitter">Twitter</a> | <a href="https://www.facebook.com/saeid.zebardast" title="Saeid Zebardast On Facebook">Facebook</a> | <a href="http://www.linkedin.com/in/saeid" title="Saeid Zebardast On LinkedIn">LinkedIn</a> | <a href="https://plus.google.com/112638433061122581433" title="Saeid Zebardast On Google+">Google+</a> | <a href="http://zebardast.ir/en/author/admin/" title="More Posts By Saeid Zebardast">More Posts (31)</a></small></div></div>]]></content:encoded> <wfw:commentRss>http://zebardast.ir/en/how-to-get-pure-content-from-html-page-in-java-via-regex/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item>
<item><title>Java: run command as root by Runtime.getRuntime().exec() in Ubuntu</title><link>http://zebardast.ir/en/java-run-command-as-root-by-runtime-getruntime-exec-in-ubuntu/</link> <comments>http://zebardast.ir/en/java-run-command-as-root-by-runtime-getruntime-exec-in-ubuntu/#comments</comments> <pubDate>Mon, 06 Dec 2010 07:53:33 +0000</pubDate> <dc:creator>Saeid Zebardast</dc:creator> <category><![CDATA[Java]]></category> <category><![CDATA[Tips]]></category><guid isPermaLink="false">http://zebardast.ir/en/?p=154</guid> <description><![CDATA[Hey a few days ago I needed to run `/etc/init.d/networking restart` command by Runtime.getRuntime().exec() in Java EE web application. The first and easiest way that came to mind was sudo without password and&#8230; It Worked! * To execute sudo without password, open /etc/sudoers by text editor like `nano`: And add your user or group to [...] 
Related posts:<ol><li><a href='http://zebardast.ir/en/root-terminal-in-ubuntu/' rel='bookmark' title='Root Terminal in Ubuntu'>Root Terminal in Ubuntu</a></li><li><a href='http://zebardast.ir/en/installing-sun-jdk-5-on-ubuntu-9-10-and-10-04/' rel='bookmark' title='Installing Sun JDK 5 on Ubuntu 9.10 and 10.04'>Installing Sun JDK 5 on Ubuntu 9.10 and 10.04</a></li></ol>]]></description> <content:encoded><![CDATA[<p>Hey <img src='http://zebardast.ir/en/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /></p><p>A few days ago I needed to run the `/etc/init.d/networking restart` command via Runtime.getRuntime().exec() in a Java EE web application. The first and easiest way that came to mind was sudo without a password and&#8230; it worked!<br /> * To run sudo without a password, open /etc/sudoers with a text editor such as `nano`:</p><pre class="brush: bash; title: ; notranslate">
$ sudo nano /etc/sudoers
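# Alternative worth considering (not from the original post): visudo opens the same file
# and checks its syntax before saving, which guards against locking yourself out of sudo.
$ sudo visudo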
</pre><p>Then add your user or group to the end of the file, as below:</p><pre class="brush: bash; title: ; notranslate">
# for user
USER_NAME ALL= NOPASSWD: ALL

# for group
%GROUP_NAME ALL= NOPASSWD: ALL
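
# narrower alternative (an assumption, not from the original post): use this instead of the
# ALL rule above to allow passwordless sudo for this one command only
USER_NAME ALL= NOPASSWD: /etc/init.d/networking restart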
</pre><p>Here is my Java code:</p><pre class="brush: java; title: ; notranslate">
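// Requires: import java.io.BufferedReader; import java.io.IOException; import java.io.InputStreamReader;
String command = &quot;sudo /etc/init.d/networking restart&quot;;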
Runtime runtime = Runtime.getRuntime();
try {
    Process process = runtime.exec(command);
    BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(process.getInputStream()));
    String line;
    while ((line = bufferedReader.readLine()) != null) {
        System.out.println(line);
    }
} catch (IOException e) {
    e.printStackTrace();
}
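
// A variation to consider (a sketch, not the original code): ProcessBuilder can merge stderr
// into stdout, so messages printed by sudo itself are read too, and waitFor() returns the
// exit code once the command has finished. The variable names here are illustrative.
ProcessBuilder builder = new ProcessBuilder(&quot;sudo&quot;, &quot;/etc/init.d/networking&quot;, &quot;restart&quot;);
builder.redirectErrorStream(true);
try {
    Process merged = builder.start();
    BufferedReader reader = new BufferedReader(new InputStreamReader(merged.getInputStream()));
    String outputLine;
    while ((outputLine = reader.readLine()) != null) {
        System.out.println(outputLine);
    }
    System.out.println(&quot;exit code: &quot; + merged.waitFor());
} catch (IOException | InterruptedException e) {
    e.printStackTrace();
}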
</pre><p><strong>Troubleshooting</strong><br /> If you get the `sudo: no tty present and no askpass program specified` error, make sure the user that runs the command has a NOPASSWD entry in /etc/sudoers.</p><p>Let me know if you find a similar or easier way <img src='http://zebardast.ir/en/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /></p><p>Related posts:<ol><li><a href='http://zebardast.ir/en/root-terminal-in-ubuntu/' rel='bookmark' title='Root Terminal in Ubuntu'>Root Terminal in Ubuntu</a></li><li><a href='http://zebardast.ir/en/installing-sun-jdk-5-on-ubuntu-9-10-and-10-04/' rel='bookmark' title='Installing Sun JDK 5 on Ubuntu 9.10 and 10.04'>Installing Sun JDK 5 on Ubuntu 9.10 and 10.04</a></li></ol></p>]]></content:encoded> <wfw:commentRss>http://zebardast.ir/en/java-run-command-as-root-by-runtime-getruntime-exec-in-ubuntu/feed/</wfw:commentRss> <slash:comments>2</slash:comments> </item>
<item><title>Installing Sun JDK 5 on Ubuntu 9.10 and 10.04</title><link>http://zebardast.ir/en/installing-sun-jdk-5-on-ubuntu-9-10-and-10-04/</link> <comments>http://zebardast.ir/en/installing-sun-jdk-5-on-ubuntu-9-10-and-10-04/#comments</comments> <pubDate>Mon, 03 May 2010 02:21:19 +0000</pubDate> <dc:creator>Saeid Zebardast</dc:creator> <category><![CDATA[Java]]></category> <category><![CDATA[Tips]]></category> <category><![CDATA[Ubuntu]]></category><guid isPermaLink="false">http://zebardast.ir/en/?p=107</guid> <description><![CDATA[Hello As you know, Sun JDK version 1.5 (also called 5) has been removed from the Ubuntu 10.04 and 9.10 repositories and replaced by version 6. The easiest way to install Sun JDK 5 is to add its repository from Ubuntu 9.04 to the list of repositories in 9.10 and 10.04. For this purpose, follow the [...] 
Related posts:<ol><li><a href='http://zebardast.ir/en/webmin-installing-on-ubuntu-gutsy-gibbon-710/' rel='bookmark' title='Webmin, Installing on Ubuntu Gutsy Gibbon (7.10)'>Webmin, Installing on Ubuntu Gutsy Gibbon (7.10)</a></li><li><a href='http://zebardast.ir/en/how-to-install-gos-on-ubuntu-gutsy-gibbon/' rel='bookmark' title='How to install gOS on Ubuntu Gutsy Gibbon'>How to install gOS on Ubuntu Gutsy Gibbon</a></li><li><a href='http://zebardast.ir/en/root-terminal-in-ubuntu/' rel='bookmark' title='Root Terminal in Ubuntu'>Root Terminal in Ubuntu</a></li></ol>]]></description> <content:encoded><![CDATA[<p>Hello <img src='http://zebardast.ir/en/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /></p><p>As you know, Sun <abbr title="Java Development Kit">JDK</abbr> version 1.5 (also called 5) has been removed from the Ubuntu 10.04 and 9.10 repositories and replaced by version 6.</p><p>The easiest way to install Sun JDK 5 is to add its repository from Ubuntu 9.04 to the list of repositories in 9.10 and 10.04. To do so, follow these steps.</p><p>1- Open /etc/apt/sources.list with a text editor like gedit:</p><pre>sudo gedit /etc/apt/sources.list</pre><p>2- Add the following lines to the end of the file, then save and close it:</p><pre>  ## For sun-java5-jdk
 deb http://ir.archive.ubuntu.com/ubuntu jaunty-updates main multiverse</pre><p>3- Update the package lists and install sun-java5-jdk:</p><pre> sudo aptitude update
 sudo aptitude install sun-java5-jdk</pre><p><em>* The method above can also be used for other applications.</em></p><p>Another way to install JDK 5 is to download the package and its dependencies from <a href="http://packages.ubuntu.com">packages.ubuntu.com</a>.</p><p>Good luck</p><p>Related posts:<ol><li><a href='http://zebardast.ir/en/webmin-installing-on-ubuntu-gutsy-gibbon-710/' rel='bookmark' title='Webmin, Installing on Ubuntu Gutsy Gibbon (7.10)'>Webmin, Installing on Ubuntu Gutsy Gibbon (7.10)</a></li><li><a href='http://zebardast.ir/en/how-to-install-gos-on-ubuntu-gutsy-gibbon/' rel='bookmark' title='How to install gOS on Ubuntu Gutsy Gibbon'>How to install gOS on Ubuntu Gutsy Gibbon</a></li><li><a href='http://zebardast.ir/en/root-terminal-in-ubuntu/' rel='bookmark' title='Root Terminal in Ubuntu'>Root Terminal in Ubuntu</a></li></ol></p>]]></content:encoded> <wfw:commentRss>http://zebardast.ir/en/installing-sun-jdk-5-on-ubuntu-9-10-and-10-04/feed/</wfw:commentRss> <slash:comments>25</slash:comments> </item> </channel> </rss>
