遗忘角落

Linux Scripts: Finding Files with a Given Extension

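I recently downloaded a lot of music albums in formats such as .flac, .wav and .ape. Fortunately, each one comes with a .cue file of the same name. For now they all sit in a single folder, so I wondered whether a script could move each album into its own folder. I can't write Windows scripts and I have only just started with Linux shell scripting, but the web is full of resources, which led to the following script: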

for f in *.cue; do
  FILE="${f%.*}"           # get all before last delimiter ‘.’ (name without extension)
  mkdir -p "$FILE"
  mv "$FILE".* "$FILE"/    # move every file sharing that base name into the folder
done
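
Before running this on a real music folder, it may help to try the same idea on throwaway files. The sketch below wraps the loop in a function; the name `sort_albums` and the sample file names are made up for illustration, and it assumes every file of an album shares the base name of its .cue file.

```shell
#!/usr/bin/env bash
# sort_albums DIR: move each album's files into a folder named after its .cue file.
# (hypothetical helper name, for illustration only)
sort_albums() {
  local dir=$1 cue album
  for cue in "$dir"/*.cue; do
    [ -e "$cue" ] || continue     # the glob matched nothing, or the file was already moved
    album="${cue%.*}"             # path without the last extension
    mkdir -p "$album"
    mv "$album".* "$album"/       # .cue, .flac, .wav, ... share the base name
  done
}

# Try it on empty placeholder files in a temporary directory.
demo=$(mktemp -d)
touch "$demo/Album One.cue" "$demo/Album One.flac" "$demo/Live.cue" "$demo/Live.ape"
sort_albums "$demo"
ls "$demo/Album One" "$demo/Live"
```

The quotes around every variable are what let album names with spaces survive; the glob `.*` is left outside the quotes so the shell still expands it.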

 

Reference material

1. Loop over all files in all sub-directories

 

# show all .fasta files in all sub-directories of folder ‘assembly’
find assembly/ -name "*.fasta"
assembly/projectA/run1.fasta
assembly/projectA/run2.fasta
assembly/projectB/run1.fasta
assembly/projectB/run2.fasta
assembly/final.fasta

# search for ‘plasmid’ in all fasta files (run the command grep on each file) in folder ‘assembly’

find assembly/ -name "*.fasta" -exec grep plasmid {} \;

# compress all .fasta files using gzip, include all sub-directories of folder ‘assembly’

find assembly/ -name "*.fasta" -exec gzip {} \;

General

find /path/ -name <pattern> -exec <command> {} \;

  -name  # filter filenames
  -exec  # run command
   {}    # on each file
   \;    # pass files one by one to the command
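
As a supplement to the notes above: `find` also accepts `+` in place of `\;`, which hands many file names to a single invocation of the command instead of running it once per file. A small sketch on scratch files (the directory layout and file contents are invented for the demo):

```shell
#!/usr/bin/env bash
# Build a tiny directory tree to search in.
demo=$(mktemp -d)
mkdir -p "$demo/assembly/projectA"
printf '>seq1 plasmid\nACGT\n' > "$demo/assembly/projectA/run1.fasta"
printf '>seq2 chromosome\nACGT\n' > "$demo/assembly/final.fasta"

# \; runs the command once per file: two separate "wc -l" calls, one output line each.
per_file=$(find "$demo/assembly" -name "*.fasta" -exec wc -l {} \; | wc -l)

# + passes both names to one call: "wc -l a b" prints a line per file plus a total line.
batched=$(find "$demo/assembly" -name "*.fasta" -exec wc -l {} + | wc -l)

# grep -l lists only the names of the files that contain the word.
matches=$(find "$demo/assembly" -name "*.fasta" -exec grep -l plasmid {} +)

echo "output lines with \\;: $per_file, with +: $batched"
echo "files mentioning plasmid: $matches"
```

For commands such as `gzip` or `grep` the result is the same either way; `+` mainly saves process start-ups when there are many files.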
————————————————————————————————————

2. String split

Split string of variable ${FILE} by delimiter ‘_’

# filename example

FILE=SRR01234_mapped_ecoli.txt

echo ${FILE%%_*}  # get all before first delimiter ‘_’
SRR01234

echo ${FILE%_*}   # get all before last delimiter ‘_’
SRR01234_mapped

echo ${FILE##*_}  # get all after last delimiter ‘_’
ecoli.txt

echo ${FILE#*_}   # get all after first delimiter ‘_’
mapped_ecoli.txt
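
The same four operators work with any delimiter. One way to remember them: `#` trims from the front, `%` trims from the back, and doubling the character makes the match as long as possible. A short sketch with ‘.’ as the delimiter (the file name is just an example):

```shell
#!/usr/bin/env bash
# The same four operators applied to a name with two dots.
NAME=backup.tar.gz

before_first=${NAME%%.*}   # longest match of ".*" removed from the end   -> backup
before_last=${NAME%.*}     # shortest match of ".*" removed from the end  -> backup.tar
after_first=${NAME#*.}     # shortest match of "*." removed from the start -> tar.gz
after_last=${NAME##*.}     # longest match of "*." removed from the start  -> gz

printf '%s\n' "$before_first" "$before_last" "$after_first" "$after_last"
```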

Basename examples

# get filename without path prefix
F=/path/to/sample/SRR01234_mapped_ecoli.txt
FILE=`basename ${F}` 
echo ${FILE}
SRR01234_mapped_ecoli.txt


# get filename without path prefix and without extension
F=/path/to/sample/SRR01234_mapped_ecoli.txt
FILENAME=`basename ${F%%.*}`  # get all before first ‘.’ (remove file ending)
echo ${FILENAME}
SRR01234_mapped_ecoli

# get sample-ID
F=/path/to/sample/SRR01234_mapped_ecoli.txt
SAMPLE=`basename ${F%%_*}`  # get all before ‘_’
echo ${SAMPLE}
SRR01234

# get speciesname “ecoli”

F=/path/to/sample/SRR01234_mapped_ecoli.txt
FILENAME=`basename ${F%%.*}`  # get all before first ‘.’ (remove file ending)
SPECIES=${FILENAME##*_}  # get all after last delimiter ‘_’
echo ${SPECIES}
ecoli
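
One caveat with `${F%%.*}`: it cuts at the first ‘.’ anywhere in the string, including inside directory names. When the extension is known, `basename` can strip it directly through its second argument. A sketch of both behaviours (the path `/data/v1.2/sample.txt` is made up to show the problem):

```shell
#!/usr/bin/env bash
F=/path/to/sample/SRR01234_mapped_ecoli.txt

DIR=$(dirname "$F")              # path prefix only
FILENAME=$(basename "$F" .txt)   # second argument: suffix to strip
NAME=${F##*/}                    # pure-shell equivalent of basename

echo "$DIR"        # /path/to/sample
echo "$FILENAME"   # SRR01234_mapped_ecoli
echo "$NAME"       # SRR01234_mapped_ecoli.txt

# A directory name containing a dot trips up the %%.* approach:
F2=/data/v1.2/sample.txt
WRONG=$(basename "${F2%%.*}")    # cut happens at "v1.2", leaving /data/v1
RIGHT=$(basename "$F2" .txt)

echo "$WRONG"      # v1
echo "$RIGHT"      # sample
```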

Alternative use: cut

# get first word (-f 1) based on delimiter ‘_’
FILE=SRR01234_mapped_ecoli.txt

SAMPLE=$( cut -d '_' -f 1 - <<< "${FILE}" )
echo ${SAMPLE}
SRR01234
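
When several fields are needed at once, bash's `read` can split the string in one step instead of calling `cut` once per field. A minimal sketch, assuming bash (for the `<<<` here-string) and a name with exactly three ‘_’-separated parts; the variable names `STEP` and `REST` are chosen only for this example:

```shell
#!/usr/bin/env bash
FILE=SRR01234_mapped_ecoli.txt

# Split on '_' into three variables; IFS is changed only for this one read command.
IFS=_ read -r SAMPLE STEP REST <<< "$FILE"
SPECIES=${REST%.txt}   # drop the known extension from the last field

echo "$SAMPLE $STEP $SPECIES"   # SRR01234 mapped ecoli
```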
—————————————————————————————————————————————

3. Loop over list of files

Run a command on each file

# do something (echo) with all .txt files
for f in *.txt;  do echo ${f}; done;

# same, but using a pipe (reading from standard input), and a while-loop
ls *.txt | while read f; do echo ${f}; done;

# do something with a small set of files
for f in file1.txt file2.txt file3.txt; do echo ${f}; done;

file1.txt
file2.txt
file3.txt

# same, but separately defined list of files
filelist="file1.txt file2.txt file3.txt"
for f in ${filelist}; do echo ${f}; done;

# reading list of files from file ‘filelist.txt’ (for larger number of files)
ls *.csv > filelist.txt    # define your list of files
for f in `cat filelist.txt`; do echo ${f}; done;

# if a line may include spaces, better use a while loop:
cat filelist.txt | while read LINE; do echo "${LINE}"; done

# loop over filenames without extension, see → basename string split
for f in Data/*.txt; do FILENAME=${f%%.*};  echo ${FILENAME};  done;

Data/fileA
Data/fileB

# loop over filenames without extension and without path prefix
for f in Data/*.txt ; do FILENAME=`basename ${f%%.*}`; echo ${FILENAME}; done

fileA
fileB

# exclude samples that are already processed:
# process input files Data/*.fastq only if the result file Results/<sample>.txt does not exist
for f in Data/*.fastq; do
  SAMPLE=`basename ${f%%.*}`
  if [ ! -f Results/${SAMPLE}.txt ]; then
     echo "processing sample ${SAMPLE}"
     # do something
  fi
done
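
The loops above break as soon as a file name contains a space, because unquoted variables are split into words. A sketch of the skip-if-already-processed loop with quoting added, run on scratch files (the sample names and the `touch` that stands in for the real work are placeholders):

```shell
#!/usr/bin/env bash
# Scratch layout: two inputs, one of which already has a result file.
demo=$(mktemp -d)
mkdir -p "$demo/Data" "$demo/Results"
touch "$demo/Data/sample A.fastq" "$demo/Data/sampleB.fastq"
touch "$demo/Results/sampleB.txt"

count=0
for f in "$demo"/Data/*.fastq; do
  [ -e "$f" ] || continue                 # no .fastq files at all
  SAMPLE=$(basename "$f" .fastq)          # quoted, so "sample A" stays one word
  if [ ! -f "$demo/Results/$SAMPLE.txt" ]; then
    echo "processing sample $SAMPLE"
    touch "$demo/Results/$SAMPLE.txt"     # stand-in for the real work
    count=$((count + 1))
  fi
done
echo "processed $count new sample(s)"
```

Running the loop a second time processes nothing, since every sample now has a result file.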

# do something 10 times
for N in {1..10}; do \
   echo ${N}; \
done

 

# double loop over a series of numbers and letters
for N in {1..5}; do
  for S in {A..C}; do
    echo ${N} ${S};
  done;
done;
1 A
1 B
1 C
2 A
2 B
2 C
...

General use of if / then / else (-f checks whether the path is a regular file):

if [ -f /path/to/${SAMPLE} ]; then
   echo "file exists"           # do something if the file exists
else
   echo "file does not exist"   # do something if the file does not exist
fi
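
`-f` is only one of several file tests; `-d` (directory), `-e` (exists, any type) and `-s` (exists and is not empty) follow the same pattern. A small sketch that combines them; the function name `describe` is hypothetical and exists only for this demo:

```shell
#!/usr/bin/env bash
# describe PATH: report what kind of thing PATH is.
describe() {
  if [ -d "$1" ]; then
    echo "directory"
  elif [ -f "$1" ] && [ -s "$1" ]; then
    echo "non-empty file"
  elif [ -f "$1" ]; then
    echo "empty file"
  else
    echo "missing"
  fi
}

demo=$(mktemp -d)
touch "$demo/empty.txt"
echo "data" > "$demo/full.txt"

describe "$demo"             # directory
describe "$demo/full.txt"    # non-empty file
describe "$demo/empty.txt"   # empty file
describe "$demo/nope.txt"    # missing
```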

Examples

# search string ‘ABC’ in all text files
for f in *.txt ; do echo --${f}-- ; grep "ABC" ${f} ; done

# copy all files in filelist.txt to a newDir/
ls *.csv > filelist.txt    # define your list of files
mkdir newDir  # create the new directory
for f in `cat filelist.txt`; do echo copying ${f}; cp ${f} newDir/; done;

# add path if files are not located in working directory
for f in `cat filelist.txt`; do echo copying ${f}; cp path/to/files/${f} newDir/; done;
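
The `for f in $(cat ...)` form splits on every space, so a name like `b c.csv` would be copied as two non-existent files. A sketch of the same copy job with a `while read` loop instead, using invented file names in a temporary directory:

```shell
#!/usr/bin/env bash
demo=$(mktemp -d)
touch "$demo/a.csv" "$demo/b c.csv"

# Write one file name per line; printf with a glob avoids parsing the output of ls.
( cd "$demo" && printf '%s\n' *.csv > filelist.txt )

mkdir -p "$demo/newDir"
copied=0
while IFS= read -r f; do            # IFS= and -r keep each line exactly as written
  echo "copying $f"
  cp -- "$demo/$f" "$demo/newDir/"  # path prefix added, as in the last example above
  copied=$((copied + 1))
done < "$demo/filelist.txt"
echo "copied $copied file(s)"
```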

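4. Difference between single and double quotes

1) Single quotes are strong quoting: every character between them loses its special meaning and is used literally. The one caveat is that a single quote cannot appear inside a single-quoted string.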

 

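2) Double quotes are weak quoting: some characters inside them still get special treatment, mainly in the following cases:

1: $ followed by a variable name expands to the value of that variable, for example:

[root@localhost ~]# echo '$PWD'
$PWD

[root@localhost ~]# echo "$PWD"
/root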

2: Text enclosed in backticks or $() is executed as a command and replaced by its output, for example:

[root@localhost ~]# echo '$(echo hello world)'
$(echo hello world)
[root@localhost ~]# echo "$(echo hello world)"
hello world

[root@localhost ~]# echo '`echo hello world`'
`echo hello world`
[root@localhost ~]# echo "`echo hello world`"
hello world

3: To use the characters $ ` " \ literally inside double quotes, escape each one with a preceding \ :

[root@localhost ~]# echo '$ ` " \'
$ ` " \
[root@localhost ~]# echo "\$ \` \" \\"
$ ` " \
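
The rules above can be put side by side in a short script. This is only an illustrative sketch; the variable names are arbitrary. The last line shows the usual workaround for the one thing single quotes cannot hold, a single quote itself.

```shell
#!/usr/bin/env bash
name="world"

single='Hello $name $(echo x)'      # nothing is expanded inside single quotes
double="Hello $name $(echo x)"      # variable and command substitution both run
escaped="Cost: \$5 for \"$name\""   # backslash keeps $ and " literal inside double quotes
apostrophe='it'\''s'                # close the quote, add an escaped ', reopen the quote

printf '%s\n' "$single" "$double" "$escaped" "$apostrophe"
# Hello $name $(echo x)
# Hello world x
# Cost: $5 for "world"
# it's
```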

 
