Sunday, November 6, 2016

Hive: Auto Increment column & Incrementing existing row_sequence

We already have an existing jar for this, so there is no need to write our own UDF. Download the jar (hive-contrib-1.1.0.jar) from the link below:
http://www.mvnrepository.com/artifact/org.apache.hive/hive-contrib/1.1.0

hive> add jar /home/hadoop/Desktop/hive-contrib-1.1.0.jar;

hive> CREATE TEMPORARY FUNCTION row_sequence as 'org.apache.hadoop.hive.contrib.udf.UDFRowSequence';
OK

CREATE TABLE IF NOT EXISTS users_inc
(
ID string,
name  string
) row format delimited fields terminated by ',' stored as textfile;

If the ID value is an int, declare the column as ID int instead when creating the table.
hive> insert into table users_inc
select m.max+row_sequence() as inc , ename
from (select ename from emp_csv) e
join
(select max(ID) as max from users_inc) m;

handling NULLs when inserting into the table for the first time (max(ID) returns NULL when the table is empty):
================
insert into table users_inc
select m.max+row_sequence() as inc , ename
from (select ename from emp_csv) e
join
(select coalesce(max(ID),0) as max from users_inc) m;

select * from users_inc;


===>for an alphanumeric ID increment value
insert into table users_inc
select concat("ABC-",m.max+row_sequence()) as inc , ename
from (select ename from emp_csv) e
join
(select coalesce(max(substr(ID,5)),0) as max from users_inc) m;

---->regexp_replace
insert into table users_inc
select regexp_replace(concat("ABC-",m.max+row_sequence()),'\\.0','') as inc, ename from (select ename from emp_csv) e
join
(select coalesce(max(substr(ID,5)),0) as max from users_inc) m;

coalesce() : COALESCE(T v1, T v2, ...)
Returns the first v that is not NULL in the list of values, or NULL if all v's are NULL.

Wednesday, January 20, 2016

unix path classpath settings

PATH=$PATH\:/u01/app/oracle/product/12.1.0/client_1/bin ; export PATH




setenv ORACLE_HOME $ORACLE_HOME\:/dir/path

export ORACLE_HOME=/u01/app/oracle/product/12.1.0/client_1;
export ORACLE_BASE=/u01/app/oracle;
export ORACLE_SID=EFMT;




https://somireddy.wordpress.com/2013/04/22/sp2-0667-message-file-sp1-msb-not-found/



export LD_LIBRARY_PATH=/u01/app/oracle/product/12.1.0/client_1/lib;
export TNS_ADMIN=/u01/app/oracle/product/12.1.0/client_1/network/admin;


 sqlplus usrname/pwd




----->JAVA unix
export CLASSPATH=/opt/IBM/WebSphere/AppServer/java_1.7.1_64/lib:.
PATH=$PATH\:/opt/IBM/WebSphere/AppServer/java_1.7.1_64/bin;export PATH

unix script to connect oracle DB

Connect_db_oracle_unix.sh
#!/bin/sh
# The variable "file" below names a file that captures the query output.
export ORACLE_HOME=/u01/app/oracle/product/12.1.0/client_1;
export ORACLE_BASE=/u01/app/oracle;
export ORACLE_SID=ORCL;   # your DB SID value

PATH=$PATH\:/u01/app/oracle/product/12.1.0/client_1/bin ; export PATH;

file="outputfile.txt"
sqlplus -s usrnm/pwd << EOF
set linesize 10000
set head off
set colsep "|"

spool $file

SELECT * from users where rownum<5;

spool off
exit
EOF
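The sqlplus connection above needs an Oracle client to run, but the here-document mechanics themselves can be sketched with cat (the filename is illustrative). Everything between `<< EOF` and the line containing only `EOF` is fed to the command's stdin:

```shell
# Feed a multi-line script to a command via a here-document
file="outputfile.txt"
cat << EOF > "$file"
set linesize 10000
SELECT * FROM users WHERE rownum < 5;
EOF

head -1 "$file"
# set linesize 10000
```

Forgetting the terminating `EOF` line is a common mistake; the shell then reads the rest of the script as heredoc input.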

MQ Java API code to retrieve all msgs

private void receiveAllMQmsgs()
{
   int openOptions = CMQC.MQOO_INQUIRE + CMQC.MQOO_INPUT_SHARED + CMQC.MQOO_FAIL_IF_QUIESCING;
   MQGetMessageOptions getOptions = new MQGetMessageOptions();
   getOptions.options = CMQC.MQGMO_NO_WAIT + CMQC.MQGMO_FAIL_IF_QUIESCING;
   boolean getMore = true;
   MQQueueManager qMgr = null;
   MQQueue queue = null;
   MQMessage receiveMsg = null;

   try
   {
      qMgr = new MQQueueManager(qManager);
      queue = qMgr.accessQueue(inputQName, openOptions);

      while(getMore)
      {
         try
         {
            receiveMsg = new MQMessage();
            queue.get(receiveMsg, getOptions);
            byte[] b = new byte[receiveMsg.getMessageLength()];
            receiveMsg.readFully(b);
            System.out.println("Message-->" + new String(b));
         }
         catch (MQException e)
         {
            if ( (e.completionCode == CMQC.MQCC_WARNING) &&
                 (e.reasonCode == CMQC.MQRC_NO_MSG_AVAILABLE) )
            {
               System.out.println("Bottom of the queue reached.");
               getMore = false;
            }
            else
            {
               System.err.println("MQRead CC=" +e.completionCode + " : RC=" + e.reasonCode);
               getMore = false;
            }
         }
         catch (IOException e)
         {
            System.out.println("MQRead " +e.getLocalizedMessage());
         }
      }
   }
   catch (MQException e)
   {
      System.err.println("MQRead CC=" +e.completionCode + " : RC=" + e.reasonCode);
   }
   finally
   {
      try
      {
         if (queue != null)
            queue.close();
      }
      catch (MQException e)
      {
         System.err.println("MQRead CC=" +e.completionCode + " : RC=" + e.reasonCode);
      }
      try
      {
         if (qMgr != null)
            qMgr.disconnect();
      }
      catch (MQException e)
      {
         System.err.println("MQRead CC=" +e.completionCode + " : RC=" + e.reasonCode);
      }
   }
}

MQ Java API useful sites


Read/write MQ  all msgs:
http://stackoverflow.com/questions/28645497/unable-to-retrieve-message-from-mq-queue

https://endrasenn.wordpress.com/2010/01/27/readwrite-to-ibm-mq-sample-java-code/ (read/write)
http://www.mqseries.net/phpBB/viewtopic.php?t=41438

http://www.ibm.com/developerworks/websphere/library/techarticles/0602_currie/0602_currie.html


http://bencane.com/2013/04/22/websphere-mq-cheat-sheet-for-system-administrators/
https://www-01.ibm.com/support/knowledgecenter/SSFKSJ_7.5.0/com.ibm.mq.ref.adm.doc/q083180_.htm

unix cmds

$ /usr/local/bin/pbrun ksh

--> validate xml
xmllint --valid xmlfile1.xml

xmllint --valid --noout doc.xml
xmllint --schema schema.xsd doc.xml




-->uniq -d test
Print only duplicate lines using the -d option (http://www.thegeekstuff.com/2013/05/uniq-command-examples/)

-->List the files containing a particular word in their text
grep check * -lR

grep '/opt/BAES_HOME/conf/wlm/fpfa' * -lR  (it will show all the files having word '/opt/BAES_HOME/conf/wlm/fpfa')


---->grep -c 'id="' DJRC_WL-AMe_XML_201510072359_F.xml
counts the lines containing id=" in the file; to count every occurrence of a string, use -o:
grep -o "<item>" a.xml | wc -l

--->Find Java class inside a folder of JARS

 for a in *.jar
 do
 echo $a; unzip -t $a | grep -i ClassFileName
 done
==>unzip -t my.jar
==>unzip -t quartz.jar|grep -i 'Pair.class'


-->
awk 'FNR==1 && NR==1{printf $1"|"}FNR==1 && NR!=1{printf $1"\n"}' file1.txt file2.txt

paste -d"|" file1.txt file2.txt
http://www.folkstalk.com/2012/09/paste-command-examples-in-unix-linux.html
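The awk one-liner above stitches together only the first line of each file; paste joins every line. A minimal runnable sketch (the sample files and contents are made up):

```shell
# Two small sample files (contents are illustrative)
printf 'a1\na2\n' > file1.txt
printf 'b1\nb2\n' > file2.txt

# Join the files line by line, separated by |
paste -d'|' file1.txt file2.txt > joined.txt
cat joined.txt
# a1|b1
# a2|b2
```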


-->if ORACLE_HOME is not defined
find / \( -name catalog.sql -o -name sql.bsq \)
/u01/app/oracle/product/12.1.0/client_1/bin

https://kb.iu.edu/d/acar

-->http://design.liberta.co.za/articles/how-to-remove-spaces-from-filenames-in-linuxunix/

--grep -P "[\x80-\xFF]" *.*

->to remove non printable chars in a file.

perl -pe's/[[:^ascii:]]//g' < ACCOUNTS_DAILY_20121206.txt > newfile_accounts_ascii.txt

--->cut -d "," -f1-10,20-25,30-33 infile.csv > outfile.csv
-->http://www.shellhacks.com/en/Printing-Specific-Columns-Fields-in-Bash-using-AWK
-->ls -p | grep -v /

Thursday, December 10, 2015

UNIX useful cmds




awk -F'|' '{print $63}' ACCOUNTS_DAILY_20150601.txt|grep '10'|uniq|head

awk '$63=="10"{$63="GTB"}1' FS='|' OFS='|' ACCOUNTS_DAILY_20140101.txt>temp.txt
awk '$63=="0"{$63="AML"}1' FS='|' OFS='|' ACCOUNTS_DAILY_20140101.txt
awk '$63=="1010"{$63="PFS"}1' FS='|' OFS='|' ACCOUNTS_DAILY_20140101.txt
awk '$63=="1000"{$63="0"}1' FS='|' OFS='|' ACCOUNTS_DAILY_20140101.txt


  • In the vi editor, open the file, then type / followed by a word from the record and press Enter.
It will jump to that specific record.


Command purpose

rm <file name> remove files (.txt, etc.)
rmdir <directory name> remove an empty directory
rm -rf <directory name> remove a non-empty directory
alias Enquire or set alias
awk Deal with tabular data
cat file* concatenate file
cd dir change working directory
chmod opts file* Change attributes and permissions
cp src* dst copy source files to destination
diff Difference between files.
display file* Display images
file file* List possible type/origin of files.
find node types find files from node matching types.
ftp host file transfer protocol
gcc GNU C Compiler (System build tool)
grep string file* find lines containing string in files
gzip file* compress files
history Command history
host str DNS lookup utility
kill N end task number N
less file Look at file
ls {dir..} list files in directories
ln {options} link files
lp file* print files
make Update series of dependent files
man topic Online help on topic
mdir floppy disk directory
mcopy floppy disk copy
more file Scroll through file or stdin
mount mount filesystems
od Look at non ascii file
ping host test connection with host
ps process status.
pwd print working directory
rxvt Open a console on X windows.
sed Batch stream editor
strings file Look at ASCII strings in a file
su {name} Swap user to root, or given name
tar opts dst file* bundle or unbundle files.
telnet host interactive connection to another computer
touch file* Change timestamps of files
vi file* edit files. ':n' in vi gives next file
wc file* Word count files
wget options download web pages or ftp server files
xset options X-windows behaviour
xterm Set up X-terminal








  1. Basic Cursor Movement for vi editor


From Command Mode
k Up one line
j Down one line
h Left one character
l Right one character (or use <Spacebar>)
w Right one word
b Left one word
NOTE: Many vi commands can take a leading count (e.g., 6k, 7e).

http://www.linuxnix.com/2012/07/23-awesome-less-known-linuxunix-command-chaining-examples.html

Extract a .gz file:
gzip -d file.gz
To see gzipped file without actually having to gunzip it
gunzip -c NORKOM_US_20140827.xml
or
zcat 12_TRANSACTIONS_DAILY_20141027.txt.gz|head -1
Extract a tar.gz file:
tar xvzf file.tar.gz
tar xvf file.tar
Compress into .Z format:
compress 1_TRANSACTIONS_DAILY_20120525.txt
the above replaces the .txt file with a .txt.Z file
gzip filename.xml
it replaces filename.xml with filename.xml.gz
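The compress/read/decompress cycle above can be verified end to end on a throwaway file (names and contents are illustrative):

```shell
printf 'hello\nworld\n' > sample.txt

gzip sample.txt                 # replaces sample.txt with sample.txt.gz

zcat sample.txt.gz | head -1    # read without extracting
# hello

gzip -d sample.txt.gz           # restores sample.txt
```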












Sending email:
mail -s 'test email from unix' kiran3kumar@hsbc.co.in
or
mailx -s 'test email from unix' kiran3kumar@hsbc.co.in < mail





Extract a jar file:
jar xf xxx.jar


Find a specific class inside a jar (jar xf extracts silently, so list the contents with jar tf and grep instead):
jar tf norkomextract.jar | grep -i ExtractNorkomData.class


find cmd

http://www.bsd.org/unixcmds.html

-- to find path of command in linux

whereis java

To find the path the operating system uses to execute a command when you enter it on the command line, use the which command instead, for example:
which lpr


(for ref: http://kb.iu.edu/data/acec.html)
--the find command is recommended for its speed and its ability to deal with filenames that contain spaces.
cd /path/to/dir
find . -type f -exec grep -l "word" {} +
find . -type f -exec grep -l "seting" {} +
find . -type f -exec grep -l "foo" {} +
find *.xml -type f -exec grep -l "table" {} +
find / -type f -name "*.sh" -exec grep -il 'KYC_PATH=' {} +
find . -type f -name kyc_common.sh
find . -type f -name "*.sh" -exec grep -il 'KYC_PATH=' {} +



Ex:
norkomsm@usvh3euap42:/opt/norkomSMDE1/norkom_home/bin> find mon_alert Start_GTB_Scenario_Manager.sh



--Retrieving specific lines of a file based on a match
grep <pattern> <filename>

grep '<parameter name="filePattern">' scheduleconfig_dataload.xml
grep '<parameter name="fileSpecPath">' scheduleconfig_dataload.xml


cat output.log | grep FAIL


To get some context along with this, e.g. the 3 lines above and below each line containing FAIL:
grep -C 3 FAIL output.log





-to remove non-printable binary characters (garbage) from a Unix text file
tr -cd '\11\12\15\40-\176' < file-with-binary-chars > clean-file
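A quick check of the tr filter above on a file seeded with a control character (the data is hypothetical; \007 stands in for the binary garbage):

```shell
printf 'ab\007cd\n' > file-with-binary-chars.txt

# Keep only tab (\11), newline (\12), carriage return (\15),
# and printable ASCII (\40-\176); delete everything else
tr -cd '\11\12\15\40-\176' < file-with-binary-chars.txt > clean-file.txt
cat clean-file.txt
# abcd
```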

--

(mv) rename command syntax

rename oldname newname *.files
For example, to rename all *.bak files to *.txt, enter: $ rename .bak .txt *.bak
rename .txt .txt.bak *.txt
rename .txt .txt.1 *20140114.txt
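The rename utility's syntax differs between distributions (the util-linux version shown above versus the Perl version), so a portable mv loop with shell suffix stripping is a safe fallback; a sketch with throwaway files:

```shell
# Create two throwaway .bak files
touch a.bak b.bak

# Rename every *.bak to *.txt; ${f%.bak} strips the .bak suffix
for f in *.bak; do
    mv "$f" "${f%.bak}.txt"
done

ls a.txt b.txt
```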

---- Find a File by Name in UNIX, Solaris, or Linux
Using the find command, one can locate a file by name.

To find a file such as filename.txt anywhere on the system:
find / -name filename.txt -print
Recent operating system versions do not require the print option because this is the default. To limit the search to a specific directory such as /usr:
find /usr -name filename.txt -print


ex: find /opt/norkomSMDE4/norkom_home/conf_PFS/monitoring/ -name processingconfig_PFS_PostProcess.xml -print
find /opt/norkomSMDE4/norkom_home/ -name processingconfig_PFS_PostProcess.xml -print




--
grep -P "[\x80-\xFF]" ACCOUNTS_UPDT_20121209.txt
LANG=C sed -i 's/[\x80-\xFF]//g' ACCOUNTS_UPDT_20121209.txt
nohup /opt/norkomSM/HSBC/scripts/shell_scripts/nrkm_monthend_mvzip.sh &



df -h
(disk free space)
One of the most commonly used commands; it displays sizes in an easy-to-read format, as shown in the example below.
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/hda2   28G   7.6G  19G    29%   /
tmpfs       252M  0     252M   0%    /lib/init/rw





-- Display all environment variables
You can use the commands env, set, and printenv to display all environment variables.

-to remove non-printable chars in a file:
perl -pe's/[[:^ascii:]]//g' < ACCOUNTS_DAILY_20121206.txt > newfile_accounts_ascii.txt

In the above example, ACCOUNTS_DAILY_20121206.txt has non-printable chars, and newfile_accounts_ascii.txt is the file after removing them.


To check for non-printable chars in a file,
grep -P "[\x80-\xFF]" *.*



to see non-printable ^ characters:
cat -v NORKOM_US_20141127.xml | grep -n '\^'

to remove ^ chars:
cat -v NORKOM_US_20141127.xml | sed 's/\^.//g' > NORKOM_US_20141127.xml_removed


grep '[^[:print:]]' NORKOM_US_20141127.xml

--- http://www.aliencoders.com/content/how-remove-m-and-other-non-printable-characters-file-linux
---Removing one record/line of file
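The heading above has no example in the notes; deleting a single record by line number is a one-liner with sed (line 2 here is arbitrary, and the filenames are illustrative):

```shell
printf 'rec1\nrec2\nrec3\n' > allrecs.txt

# Delete line 2; use sed -i to edit the file in place instead
sed '2d' allrecs.txt > kept.txt
cat kept.txt
# rec1
# rec3
```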








---Remove spaces in a file name
cp '1_ACCOUNTS_ DAILY_20130701.txt' `ls '1_ACCOUNTS_ DAILY_20130701.txt' | tr -d ' '`





--add header trailer to a file
sed '1i\HR|t|4' mytest.txt | sed '2i\hr2|r|0' | sed '$a\#TR|End' > mytest.txt.tmp

sed '1i\HR|t|4' mytest.txt | sed '4i\ram nest|r|0' | sed '$a\#TR|End' > mytest.txt.tmp.1

prog:
#!/bin/sh

FILES=$(echo *.in)

for file in $FILES
do
echo "Modifying file : $file"

sed '1i\
h1|t|4' $file | sed '2i\
h2|r|0' | sed '$a\
#End' > $file.tmp

mv $file.tmp $file
done


Remove the HR and TR (first and last lines) of a file,

sed '1d;$d' file > newfile

or
below is a better one:
sed -i'' -e '1d' -e '$d' yourfile
sed -i'' -e '1d' -e '$d' *20140624.txt
1d deletes the first line; $d deletes the last line.
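A quick sanity check of the header/trailer removal on a tiny made-up file:

```shell
# Header (HR), two data rows, trailer (TR)
printf 'HR|t|4\ndata1\ndata2\n#TR|End\n' > withhdr.txt

# 1d drops the first line, $d the last
sed '1d;$d' withhdr.txt > body.txt
cat body.txt
# data1
# data2
```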

-cmd to see the folder structure of a directory
ls -R | grep ":$" | sed -e 's/:$//' -e 's/[^-][^\/]*\//--/g' -e 's/^/ /' -e 's/-/|/'


- first column of file, excluding NRKM (| separated file)
cut -d'|' -f1 NRKM_CASES_RBWMCMB_DAILY_20131127.txt.1 | grep -v 'NRKM'




  • File sizes in human-readable format (KB, MB, GB, etc.)
ls -lh


->to see control-M characters in a file (^M)
cat -v file.sh
perl -p -i -e "s/\r//g" file.sh

->to get the 3rd field of a file whose records are pipe-separated
cut -d'|' -f3 NGTInvalidCustNullCust.bad | uniq



- grep '23039451\|23039443\|23027887\|7104084178\|7104084111' ODS_CRM_ACCTS_DAILY_20140125.xml

On Solaris,
egrep '23039451|23039443|23027887|7104084178|7104084111' CRM_ACCOUNT_DAILY_20140129.xml

the above will work on all OSes


-- lines before and after grep match
---------------------------------
grep -B5 "the living" gettysburg-address.txt # show all matches, and five lines before each match
grep -A10 "the living" gettysburg-address.txt # show all matches, and ten lines after each match
grep -B5 -A5 "the living" gettysburg-address.txt # five lines before and five lines after

http://alvinalexander.com/unix/edu/examples/grep.shtml
http://www.thegeekstuff.com/2011/10/grep-or-and-not-operators/






OK, assuming that your file is a text file with fields separated by the comma separator ','. You would also need to know the position of the 'transactionid' field; assume it is the 7th field.
awk -F ',' '{print $7}' text_file | sort | uniq -c
awk -F '|' '{print $4}' NRKM_CASE_REL_DAILY_20140624.txt | sort | uniq -c |wc -l
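The same count-distinct-values pipeline on a three-line made-up file, counting field 2:

```shell
printf 'a|x\nb|y\nc|x\n' > txns.txt

# Occurrence count per distinct value of field 2, most frequent first
awk -F'|' '{print $2}' txns.txt | sort | uniq -c | sort -rn
#   2 x
#   1 y
```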



specific fields, pipe-separated:
awk -F'|' '{print $1"|"$1"|P|"}' CUSTOMERS_DAILY_20150317.txt|grep -v 'PSEUDO_'|head



sort and uniq records:
sort file | uniq



Add pipes to the end of each line (the sed below appends three)
sed 's/$/ |||/' mytest.txt > new.txt

sed -e 's/$/test added/' InFile >OutFile


Here are three options:
  • awk and its variants (gawk, mawk etc.):
    awk '{if(NR==1){print $0,"| Place"} else{print $0,"| Paris"}}' file.txt
  • Perl:
    perl -lne '$.==1 ? print "$_ | Place" : print "$_ | Paris"' file.txt
  • sed:
    sed '1 s/$/ | Place/; 1! s/$/ | Paris/' file.txt





-----Remove Control-M chars
Perhaps you have ctrl-M (carriage returns) at the end of each line in your file.
To View: cat -v /data/upgrade/upgrade_db.ksh
To Fix: perl -p -i -e "s/\r//g" /data/upgrade/upgrade_db.ksh



Unix Sed Tutorial: Find and Replace Text Inside a File Using RegEx



sed -i 's#trunc(systimestamp) - 18#trunc(systimestamp) - 11#g' filename

sed -i 's#trunc(systimestamp - 12)#trunc(systimestamp - 5)#g'

sed -i 's#QA2#QA4#g' filename





get the no. of pipes in each record of a file
awk -F"|" '{print NF-1}' NGT_CUSTINFO_20141021.txt | sort -n | uniq -c



-send email from unix/linux
echo "something" | mailx -s "subject" kiran3kumar@hsbc.co.in

mailx -s "subject" kiran3kumar@hsbc.co.in <nrkm_sftp_test.txt

uuencode nrkm_sftp_test.txt nrkm_sftp_test.txt| mail kiran3kumar@hsbc.co.in -s "subject"

uuencode nrkm_sftp_test.txt nrkm_sftp_test.txt| mailx kiran3kumar@hsbc.co.in -s "llsubject"


Send an Email with Attachment and Body


( cat nrkm_body_test.txt; uuencode nrkm_sftp_test.txt nrkm_sftp_test.txt )| mail -s "Sub:Email With Body Text and Attachment" kiran3kumar@hsbc.co.in






Display common lines in both files:
perl -ne 'print if ($seen{$_} .= @ARGV) =~ /10$/' file1 file2

or
comm -1 -2 /path/to/file1 /path/to/file2
here file1 and file2 must contain sorted data.

For sorting use below
sort /path/to/file1 > /path/to/file1_sorted
sort /path/to/file2 > /path/to/file2_sorted
comm -1 -2 /path/to/file1_sorted /path/to/file2_sorted

Try:
only file1 elements
comm -2 -3 /path/to/file1_sorted /path/to/file2_sorted
only file2 elements
comm -1 -3 /path/to/file1_sorted /path/to/file2_sorted
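All three comm variants demonstrated on two small unsorted files (contents are made up; comm requires sorted input, hence the sort step):

```shell
printf 'b\na\nc\n' > f1.txt
printf 'c\nd\nb\n' > f2.txt
sort f1.txt > f1.sorted
sort f2.txt > f2.sorted

comm -12 f1.sorted f2.sorted   # common to both: b, c
comm -23 f1.sorted f2.sorted   # only in f1: a
comm -13 f1.sorted f2.sorted   # only in f2: d
```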


-to see last line of multiple files
tail -n1 file1.txt file2.txt --last one line of files
tail -n2 file1.txt file2.txt --last 2 lines of files


--File operations

Data Masking in unix cmds:
b86471@usvh3euap42:/home/b86471> cat feedbeforemask.txt
1111|hsgsfr|tgfr|6hhgvcx|okhjg|hhhhgfgh||
After masking 3rd fields,
b86471@usvh3euap42:/home/b86471> awk -F'|' -v OFS='|' '{$3="9999"}1' feedbeforemask.txt
1111|hsgsfr|9999|6hhgvcx|okhjg|hhhhgfgh||

After masking the 1st, 4th, and 3rd fields in a file,

b86471@usvh3euap42:/home/b86471> awk -F'|' -v OFS='|' '{$1="xxxxx"} {$4="9999"} {$3="9999"}1' feedbeforemask.txt
xxxxx|hsgsfr|9999|9999|okhjg|hhhhgfgh||

placing masked file into other file,
b86471@usvh3euap42:/home/b86471> awk -F'|' -v OFS='|' '{$1="xxxxx"} {$4="9999"} {$3="9999"}1' feedbeforemask.txt>feedaftermasking.txt
b86471@usvh3euap42:/home/b86471> cat feedaftermasking.txt
xxxxx|hsgsfr|9999|9999|okhjg|hhhhgfgh||


b86471@usvh3euap42:/home/b86471> awk -F'|' -v OFS='|' '{$2=$4=$3="9999"}1' feedbeforemask.txt
1111|9999|9999|9999|okhjg|hhhhgfgh||

Apply a function to the 2nd field
awk -F'|' -v OFS='|' '{$2=substr($2,2,3)}1' feedbeforemask.txt


b86471@usvh3euap42:/home/b86471> cat feedbeforemask.txt
1111|hsgsfr|tgfr|6hhgvcx|okhjg|hhhhgfgh||
b86471@usvh3euap42:/home/b86471> awk -F'|' -v OFS='|' '{sub(/h/, "cc", $2)}1' feedbeforemask.txt
1111|ccsgsfr|tgfr|6hhgvcx|okhjg|hhhhgfgh||
b86471@usvh3euap42:/home/b86471> awk -F'|' -v OFS='|' '{gsub(/1/, "2", $1)}1' feedbeforemask.txt
2222|hsgsfr|tgfr|6hhgvcx|okhjg|hhhhgfgh||
b86471@usvh3euap42:/home/b86471> awk -F'|' -v OFS='|' '{gsub("h", "z",$2)}1' feedbeforemask.txt
1111|zsgsfr|tgfr|6hhgvcx|okhjg|hhhhgfgh||

Replacing the first and second columns of the feed
awk -F'|' -v OFS='|' '{gsub("1", "2", $1)}{gsub("h", "z",$2)}1' feedbeforemask.txt
2222|zsgsfr|tgfr|6hhgvcx|okhjg|hhhhgfgh||
----awk -F'|' '{print $1"|"$1"|P|"}' file >temp file
awk -F'|' '{print $1"|"substr($2,1,3)"|"$3"|"$4"|"$5"|"$6"|"}' CUSTOMER_ASSOC_INFO_DAILY_20150601.txt|head
------awk '$70==""{$70="PFS"}1' FS='|' OFS='|' testfile.txt
-awk '{ if (length($2) > max) max = length($2) } END { print max }' CUSTOMER_ASSOC_INFO_DAILY_20150601.txt


-----get a specific line of a huge file
# print line number 52
sed -n '52p' # method 1
sed '52!d' # method 2
sed '52q;d' # method 3, efficient on large files
sed '52q;d' CUSTOMERS_DAILY_20150127.txt.bkp
Assuming you need lines 20 to 40,
sed -n '20,40p' file_name
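The sed variants above can be checked against a simple numbered file:

```shell
seq 1 100 > numbered.txt

sed -n '52p' numbered.txt             # prints 52
sed '52q;d' numbered.txt              # same result, but quits at line 52 (fast on huge files)
sed -n '20,40p' numbered.txt | wc -l  # 21 lines (20 through 40 inclusive)
```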




get the no. of pipes in each record of a file
awk -F"|" '{print NF-1}' NGT_CUSTINFO_20141021.txt | sort -n | uniq -c

get the records whose no. of pipes is not equal to 15 (try NF-1 also)
awk -F'|' 'NF!=16 {print "Record",NR,"Fields count:",NF}' ASSOCIATED_NAME_DAILY_20150127.txt
get the records whose no. of pipes equals 15:
awk -F'|' 'NF==16 {print "Record",NR,"Fields count:",NF}' ASSOCIATED_NAME_DAILY_20150127.txt|wc -l

get recs whose no. of pipes is not 112
awk -F'|' 'NF-1!=112 {print "Record",NR,"Fields count:",NF-1}' CUSTOMERS_DAILY_20150127.txt.bkp
print the record no. and record for lines having 3 pipes
awk -F'|' 'NF-1==3 {print NR,$0}' CUSTOMERS_DAILY_20150127.txt.bkp
or
awk -F'|' 'NF==4 {print NR,$0}' CUSTOMERS_DAILY_20150127.txt.bkp


NF = number of fields in the current record
NR = current record (row) number
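The field-count checks above in miniature, expecting 4 fields (3 pipes) per record on made-up data:

```shell
printf 'a|b|c|d\na|b\na|b|c|d\n' > fieldtest.txt

# Report any record whose field count is not 4
awk -F'|' 'NF!=4 {print "Record",NR,"Fields count:",NF}' fieldtest.txt
# Record 2 Fields count: 2
```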

Remove blank lines from a file

Use one of the following commands to remove blank lines from a file.
1. Using the grep command :
$ grep -v "^$" file.txt
2. Using the
sed command :
$ sed '/^$/d' file.txt
3. Using the
awk command :
$ awk '/./' file.txt
4. Using the
tr command :
$ tr -s '\n' < file.txt
You also can pipe the output of each command to a new file, like follows :
$ grep -v "^$" input.txt > output.txt
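The four blank-line removers above are interchangeable; checking the grep version on a small made-up file:

```shell
printf 'one\n\ntwo\n\n\nthree\n' > blanks.txt

# -v inverts the match; ^$ matches empty lines
grep -v "^$" blanks.txt > noblanks.txt
cat noblanks.txt
# one
# two
# three
```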

Print Lines Between Two Patterns with SED:
sed -n '/StartPattern/,/EndPattern/p' FileName


-change specific fields of a big file
I have a very big file (more than 10000 columns). I would like to change 3 entries in the second column and keep everything else the same, including the field separator.
awk '$2==120{$2=1201}1' FS='\t' OFS='\t' file
awk '$1=="eCRM"{$1="CMB"}1' FS='|' OFS='|' testfile.txt
For multiple conditions just add more pattern{action} blocks, i.e.:
awk '$2==120{$2=1201}$3==130{$3=1301}1'

awk -F'|' '$20==" "{print $0}' OFS='|' ACCOUNTS_DAILY_20150601.txt|wc -l



To see longest line of file:

wc -L testfile.txt
it shows only the longest line's length (wc -L is a GNU extension)

the command below shows all lines with their lengths, sorted:
$ awk '{ print length(), NR, $0 | "sort -rn" }' tmp.txt
10 3 abracadabr
8 4 mu mu mu
7 2 tatatat
6 1 lalala
To show just the first line:
$ awk '{ print length(), NR, $0 | "sort -rn" }' tmp.txt | head -n 1
10 3 abracadabr
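Since wc -L is GNU-only, a portable awk version of the same idea, run on the sample data above:

```shell
printf 'lalala\ntatatat\nabracadabr\nmu mu mu\n' > tmp.txt

# Track the longest line seen; print its length and the line itself
awk '{ if (length($0) > max) { max = length($0); line = $0 } } END { print max, line }' tmp.txt
# 10 abracadabr
```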



-The example below prints lines 120, 145, and 1050 from the syslog.
$ sed -n -e 120p -e 145p -e 1050p /var/log/syslog



-split a big file into a set of 1 GB (1024 MB) files
> split -b 1024m <file> <split-name>

> split -b 1024m reallybigfile smallfile
should give you files smallfileaa, smallfileab, smallfileac and smallfilead


split -b 10m file1.txt splitfile (split into a set of 10 MB files)
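Splitting and reassembling can be verified end to end with cmp, scaled down here to 1 KB pieces on a generated file:

```shell
# 3000 bytes of 'x' -> three pieces of at most 1 KB each
head -c 3000 /dev/zero | tr '\0' 'x' > big.txt
split -b 1k big.txt piece.

ls piece.*                 # piece.aa  piece.ab  piece.ac
cat piece.* > rebuilt.txt
cmp -s big.txt rebuilt.txt && echo "files match"
# files match
```

cat in glob order is all that is needed to reassemble, since split names the pieces in lexical order.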





--password protected folder/file
zip -P password -r zzzz.zip zzzz.txt

zip -P password -r zzzz.zip zzzz (here zzzz is folder)

zip -P happyhsbc -r UAT4.zip UAT4