Sunday, April 16, 2023

A Comprehensive Guide to Reading and Writing CSV Files in Java

Introduction:

CSV (Comma-Separated Values) is a popular file format for storing tabular data, in which values are separated by commas or other delimiters. It is commonly used to exchange data between systems and is supported by many spreadsheet applications. In this blog post, we will explore how to read and write CSV files in Java, first with the standard library and then with a dedicated CSV library.

Reading CSV Files in Java:

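To read CSV files in Java, we can use the built-in java.io package, which provides classes for reading and writing text files. Here's a basic example that reads a file line by line and splits each line on commas. Note that a plain split(",") does not handle quoted fields that contain commas: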

        
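    // try-with-resources closes the reader even if an exception is thrown
    try (BufferedReader br = new BufferedReader(new FileReader("example.csv"))) {
        String line;
        // Read the file one line at a time until the end is reached
        while ((line = br.readLine()) != null) {
            // Naive split: breaks on every comma, including commas inside quotes
            String[] fields = line.split(",");
            // Process the fields here
        }
    }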
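
Because split(",") treats every comma as a delimiter, a field such as "Smith, John" would be cut in two. The following is a minimal sketch of a quote-aware line parser, assuming the common convention that fields may be wrapped in double quotes and that a doubled quote ("") stands for a literal quote. The class name CsvLineParser and the method parseLine are hypothetical names chosen for illustration, and the sketch does not handle fields that span multiple lines.

```java
import java.util.ArrayList;
import java.util.List;

public class CsvLineParser {

    /** Splits one CSV line into fields, honoring double-quoted fields. */
    public static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote means a literal quote
                        i++;                 // skip the second quote of the pair
                    } else {
                        inQuotes = false;    // closing quote of the field
                    }
                } else {
                    current.append(c);       // commas inside quotes are plain text
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote of a field
            } else if (c == ',') {
                fields.add(current.toString()); // delimiter ends the current field
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        // The second field contains a comma, the third contains quotes
        for (String field : parseLine("1,\"Smith, John\",\"said \"\"hi\"\"\"")) {
            System.out.println(field);
        }
    }
}
```

For anything beyond simple files, a dedicated CSV library is usually the safer choice.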
    
    

Writing CSV Files in Java:

To write CSV files, we can use the BufferedWriter class, which writes text to a character-output stream. We pass its constructor a FileWriter object that represents the CSV file we want to write. Here's an example:

    
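    // try-with-resources flushes and closes the writer when the block ends
    try (BufferedWriter bw = new BufferedWriter(new FileWriter("example.csv"))) {
        bw.write("Field1,Field2,Field3"); // header row
        bw.newLine();
        bw.write("Value1,Value2,Value3"); // data row
        bw.newLine();
    }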
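
The snippet above writes fixed strings, which works only while no value contains a comma, a quote, or a line break. The following is a minimal sketch of how a row could be escaped before it is passed to bw.write, assuming the common convention of wrapping such fields in double quotes and doubling any embedded quotes. The names CsvLineWriter, escape, and toCsvLine are hypothetical and used only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class CsvLineWriter {

    /** Quotes a field only when it contains a comma, a quote, or a line break. */
    public static String escape(String field) {
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        // Double every embedded quote, then wrap the whole field in quotes
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    /** Joins the escaped fields with commas to form one CSV line. */
    public static String toCsvLine(List<String> fields) {
        return fields.stream()
                .map(CsvLineWriter::escape)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // prints: Value1,"Smith, John","5"" disk"
        System.out.println(toCsvLine(Arrays.asList("Value1", "Smith, John", "5\" disk")));
    }
}
```

The resulting string can be written with bw.write(...) followed by bw.newLine(), exactly as in the example above.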
    

Using a CSV Library - OpenCSV:

Now we will explore how to perform CSV file operations in Java using a CSV library. To begin, you need to add a CSV library to your Java project. Several libraries, such as OpenCSV, Apache Commons CSV, and Super CSV, provide APIs for reading and writing CSV files. For this tutorial, we will use OpenCSV, a widely used CSV library for Java.

You can add OpenCSV to your project by including the following Maven dependency in your project's build file (Gradle users can declare the same coordinates, com.opencsv:opencsv:5.5.2):

    
        <dependency>
            <groupId>com.opencsv</groupId>
            <artifactId>opencsv</artifactId>
            <version>5.5.2</version>
        </dependency>
    

Here's an example of how you can read a CSV file using OpenCSV:

    
        import com.opencsv.CSVReader;
        import com.opencsv.exceptions.CsvValidationException;
        import java.io.FileReader;
        import java.io.IOException;

        public class CsvReaderExample {
            public static void main(String[] args) {
                String csvFile = "path/to/your/csv/file.csv";
                try (CSVReader reader = new CSVReader(new FileReader(csvFile))) {
                    String[] line;
                    // readNext() returns null once the end of the file is reached
                    while ((line = reader.readNext()) != null) {
                        // Process the CSV data
                        for (String data : line) {
                            System.out.print(data + " ");
                        }
                        System.out.println();
                    }
                } catch (IOException | CsvValidationException e) {
                    // In OpenCSV 5.x, readNext() also declares CsvValidationException
                    e.printStackTrace();
                }
            }
        }
    

Here's an example of how you can write data to a CSV file using OpenCSV:

    
        import com.opencsv.CSVWriter;
        import java.io.FileWriter;
        import java.io.IOException;

        public class CsvWriterExample {
            public static void main(String[] args) {
                String csvFile = "path/to/your/csv/file.csv";
                // try-with-resources flushes and closes the writer automatically
                try (CSVWriter writer = new CSVWriter(new FileWriter(csvFile))) {
                    // Write data to the CSV file
                    String[] data1 = {"ABC", "EFG", "30"};
                    writer.writeNext(data1);
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    

Saturday, June 2, 2012

XML Response in Python

Writing an XML response document in Python is pretty easy.
While working on one of my projects, I wrote some
methods that make it even easier:


import xml.dom.minidom


class MyXml:
    def __init__(self):
        self.doc = xml.dom.minidom.Document()

    def add_root(self, node_str):
        """creates and returns root node"""
        root = self.doc.createElementNS("http://mynamespace.com", node_str)
        self.doc.appendChild(root)
        return root       


    def add_node(self, node, node_str):
        """creates and returns a child node"""
        ch_node = self.doc.createElementNS("http://mynamespace.com", node_str)
        node.appendChild(ch_node)
        return ch_node
       
    def add_txt_value(self, node, value):
        """creates a text node and appends to existing node"""
        txt_node = self.doc.createTextNode(str(value))
        node.appendChild(txt_node)


#==================================================
# Example: to create an XML response document, simply add nodes and text
# as shown below. The goal is a document like:
# <?xml version="1.0" encoding="utf-8"?>
# <response>
#       <success> Hey i got your msg</success>
# </response>
#==================================================


if __name__ == '__main__':
    xmlObj = MyXml()
    # create the root node
    root = xmlObj.add_root("response")
    # add a child node: arg1 is the parent node, arg2 is the child node name
    node1 = xmlObj.add_node(root, "success")
    # add the success string to the success node
    xmlObj.add_txt_value(node1, "Hey i got your msg")
    # serialize the finished document to check the result
    print(xmlObj.doc.toxml())

Wednesday, January 4, 2012

Mahout Recommendation Engine

Apache Mahout implements scalable data mining algorithms on top of Apache Hadoop. It provides classification, clustering, and collaborative filtering algorithms that can be used to analyze large-scale data and predict user behavior.

Mahout implements collaborative filtering based on:
1. User preferences
2. Item similarity (product similarity)

Here is sample code for building recommendations based on item similarity.
Requirements:
1. Maven is needed to build a Mahout project.
2. Input file: the content of the file looks like this:
userid, itemid, preference
101,202,3
101,203,5
102,202,2
Note: Both userid and itemid must be of type long, and preference must be of type float.
The Mahout recommendation API does not support strings, so you need to resolve your data to numeric IDs before feeding it into the Mahout recommender.
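
One way to resolve string identifiers to numeric IDs is to assign each distinct string the next free long value and reuse it on every later occurrence. The following is a minimal sketch of that idea in plain Java. The class IdMapper, its methods, and the sample names (alice, laptop, and so on) are hypothetical and are not part of the Mahout API. It also assumes the whole mapping fits in memory and is kept around so that recommended IDs can be translated back to names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class IdMapper {

    // Remembers the numeric ID assigned to each string, in insertion order
    private final Map<String, Long> ids = new LinkedHashMap<>();
    private long nextId = 1;

    /** Returns the existing ID for the key, or assigns the next free one. */
    public long idFor(String key) {
        Long id = ids.get(key);
        if (id == null) {
            id = nextId++;
            ids.put(key, id);
        }
        return id;
    }

    /** Converts a "user,item,preference" line with string keys to numeric IDs. */
    public static String convert(IdMapper users, IdMapper items, String line) {
        String[] parts = line.split(",");
        long userId = users.idFor(parts[0].trim());
        long itemId = items.idFor(parts[1].trim());
        return userId + "," + itemId + "," + parts[2].trim();
    }

    public static void main(String[] args) {
        IdMapper users = new IdMapper();
        IdMapper items = new IdMapper();
        String[] raw = {"alice,laptop,3", "alice,phone,5", "bob,laptop,2"};
        for (String line : raw) {
            // prints 1,1,3 then 1,2,5 then 2,1,2
            System.out.println(convert(users, items, line));
        }
    }
}
```

Users and items get separate mappers here so that the two ID spaces stay independent; the converted lines have the userid,itemid,preference shape described above.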

Output: The code below takes input in the format given above and writes its output to the given file as:
user,recom1,recom2,recom3,recom4,recom5
Note: Recommendations are arranged in descending order of recommendation strength. If customer preferences are not known, there is no ordering and the recommender below becomes a binary recommender, meaning that you either like a product (1) or you don't like it (0).


import java.io.File;
import java.util.List;
import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.slopeone.SlopeOneRecommender;
import org.apache.mahout.cf.taste.recommender.*;
import org.apache.mahout.cf.taste.model.*;
import org.apache.mahout.cf.taste.eval.*;
import org.apache.mahout.common.*;
import java.io.FileWriter;
import java.io.BufferedWriter;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.impl.common.LongPrimitiveIterator;
import org.apache.mahout.cf.taste.common.Weighting;
import org.apache.mahout.cf.taste.impl.recommender.slopeone.MemoryDiffStorage;
import org.apache.mahout.cf.taste.recommender.slopeone.DiffStorage;


public class HiveLog {

    public static void main(String... args) throws Exception {

        // Create the data source (model) from the CSV file
        File inputFile = new File("/home/test/test_input.csv");
        final DataModel model = new FileDataModel(inputFile);

        // Open the output file in append mode
        FileWriter fstream = new FileWriter("/home/test/recommendation.csv", true);
        BufferedWriter out = new BufferedWriter(fstream);

        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
            @Override
            public Recommender buildRecommender(DataModel model) throws TasteException {
                DiffStorage diffStorage = new MemoryDiffStorage(model, Weighting.WEIGHTED, Long.MAX_VALUE);
                return new SlopeOneRecommender(model, Weighting.WEIGHTED, Weighting.WEIGHTED, diffStorage);
            }
        };

        Recommender recommender = recommenderBuilder.buildRecommender(model);

        // For all users
        for (LongPrimitiveIterator it = model.getUserIDs(); it.hasNext();) {
            long userId = it.nextLong();

            // Get up to 8 recommendations for the user
            List<RecommendedItem> recommendations = recommender.recommend(userId, 8);
            int i = 0;
            for (RecommendedItem recommendedItem : recommendations) {
                if (i == 0) {
                    // First item: start the line with the user ID
                    out.write(userId + "," + recommendedItem.getItemID());
                } else {
                    out.write("," + recommendedItem.getItemID());
                }
                i++;
            }
            out.newLine();
        }

        out.close();
    }
}


For more details on Mahout and its algorithm implementations, please write to me.



  

Thursday, February 17, 2011

Cron Configuration of "crontab" on prior Installation of Cygwin (Tested with Windows XP Only)

Hi,
You may want to configure crontab on top of a prior installation of Cygwin and get an error like:
Error starting service: 1060

You can get rid of the problem by removing the old version of the Cygwin DLL (cygwin1.dll), which can be found in the Windows system directory, such as C:\WINDOWS\system32, or, if OpenSSH is installed, in the C:\Program Files\OpenSSH\bin directory.

To remove this file, you first need to change its permissions:

chmod 777 /cygdrive/c/WINDOWS/system32/cygwin1.dll
Don't worry, this changes the permissions of your cygwin1.dll file only.
Once the file permissions have been updated, you can remove the file:
rm -f /cygdrive/c/WINDOWS/system32/cygwin1.dll
 
Now follow the steps below and your cron is ready to use.

pawan.singh@pawanksingh ~
$ cron-config
Cron is already installed as a service under account LocalSystem.
Do you want to remove or reinstall it? (yes/no) yes
OK. The cron service was removed.

Do you want to install the cron daemon as a service? (yes/no) yes
Enter the value of CYGWIN for the daemon: [ ] ntsec

You must decide under what account the cron daemon will run.
If you are the only user on this machine, the daemon can run as yourself.
   This gives access to all network drives but only allows you as user.
Otherwise cron should run under the local system account.
  It will be capable of changing to other users without requiring a
  password, using one of the three methods detailed in
  http://cygwin.com/cygwin-ug-net/ntsec.html#ntsec-nopasswd1
Do you want the cron daemon to run as yourself? (yes/no) no


Running cron_diagnose ...
WARNING: Your computer does not appear to have a cron table for pawan.singh.
Please generate a cron table for pawan.singh using 'crontab -e'

... no problem found.

Do you want to start the cron daemon as a service now? (yes/no) yes
OK. The cron daemon is now running.

In case of problem, examine the log file for cron,
/var/log/cron.log, and the Windows event log (using /usr/bin/cronevents)
for information about the problem cron is having.

Examine also any cron.log file in the HOME directory
(or the file specified in MAILTO) and cron related files in /tmp.

If you cannot fix the problem, then report it to cygwin@cygwin.com.
Please run the script /usr/bin/cronbug and ATTACH its output
(the file cronbug.txt) to your e-mail.

WARNING: PATH may be set differently under cron than in interactive shells.
         Names such as "find" and "date" may refer to Windows programs. 


Wednesday, February 9, 2011

Manage ssh sessions

Hi All,

Here is a small script that manages idle ssh connections based on an idle-hour difference and an idle-minute difference. The script is simple and self-explanatory. If anything is still unclear, you can write back to me.


#!/bin/bash
## Written By : jeet27.pawan@gmail.com
## last Updated: 0000-00-00

###### get Command line input ########
# arg1: idle connection hour diff                 #
# arg2: idle connection minute diff              #
#################################

if [ $# -eq 2 ]; then
    idl_hr_diff=$1
    idl_mt_diff=$2
elif [ $# -eq 1 ]; then
    idl_hr_diff=$1
    idl_mt_diff=0
else
    echo "please enter at least one command line argument as idle time hour diff"
    exit 1
fi

list=`ps -W | grep 'sh.exe' | awk '{print $1":"$7}'`;
month=`date | awk '{print $2}'`;
prevmonth=`date -d 'last month' '+%b'`;
#echo $month
hr=`date | awk '{print $4}' | cut -d ':' -f 1`;
mt=`date | awk '{print $4}' | cut -d ':' -f 2`;
#echo $list
for p in $list
do
{
pid=`echo $p| cut -d ':' -f 1`;
hour=`echo $p| cut -d ':' -f 2`;
#echo $hour
minute=`echo $p| cut -d ':' -f 3`;
#echo $minute
#echo $hr
#echo $mt
    if [[ "$month" == "$hour" ]]; then
    {
#        echo 1
        echo "Killing ssh session with pid :"$pid;
        /usr/bin/kill -f $pid;
    }
    elif [[ "$prevmonth" == "$hour" ]]; then
    {
#        echo 11
        # assignment must not use a leading "$"; 10# keeps "08"/"09" from being read as octal
        hour=0
        let hour_diff=10#$hr-$hour
        if [[ "$hour_diff" -gt "$idl_hr_diff" ]] ;then
        {
            echo "Killing ssh session with pid :"$pid;
            /usr/bin/kill -f $pid;
        }
        fi
    }
    else
    {
    # force base 10 so values such as "08" and "09" are not parsed as octal
    let hour_diff=10#$hr-10#$hour
    let mid_night_hour_diff=10#$hour-10#$hr
#    echo "hello "$hour_diff
    let min_diff=10#$mt-10#$minute
#    echo "hi "$min_diff

        if [[ "$hour_diff" -ge "$idl_hr_diff" ]];then
            {   
            if [[ "$min_diff" -ge "$idl_mt_diff" ]]; then
            {
#            echo 2
            echo "Killing ssh session with pid :"$pid
            /usr/bin/kill -f $pid;
            }
            else
            {
#            echo 3
            echo "Idle time less than threshold value. Skipping process with pid: "$pid
            }
            fi
            }
        else
            {
#            echo 4
            echo "Idle time less than threshold value. Skipping process with pid: "$pid
            }
        fi
    }
    fi
}
done