Saturday, 24 January 2009

Mass file renaming.

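Thankfully, there is a civilised way.

Rename all .xcf.png files so that they end in plain .png. This uses the Perl-based rename, which takes a substitution expression; note that the ???? glob only picks up files with a four-character base name, so use *.xcf.png to catch everything:

rename 's/\.xcf\.png$/.png/' ????.xcf.png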
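
If the Perl rename is not installed, the same job can be sketched in a few lines of Python 3. This is my own minimal sketch, not part of the script quoted further down; the function names plan_renames and apply_renames are made up for the illustration. It builds the list of renames first, so you can look at it before anything on disk changes.

```python
from pathlib import Path

def plan_renames(directory=".", suffix_old=".xcf.png", suffix_new=".png"):
    """Return (old, new) path pairs for files ending in suffix_old; nothing is renamed."""
    pairs = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.name.endswith(suffix_old):
            new_name = path.name[:-len(suffix_old)] + suffix_new
            pairs.append((path, path.with_name(new_name)))
    return pairs

def apply_renames(pairs):
    """Carry out the planned renames, skipping any that would overwrite a file."""
    for old, new in pairs:
        if new.exists():
            print("skipping %s: %s already exists" % (old, new))
            continue
        old.rename(new)

# Typical use: inspect plan_renames(".") first, then pass the list to apply_renames().
```

Keeping the planning and the renaming apart is the same idea as the -l (list only) switch of the script below.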



Or, the hard way:

Copied directly from http://www.linuxquestions.org/blog/ghostdog74-290378/2008/9/17/simple-mass-file-renamer-and-deleter-python-1190/
by 'ghostdog74':

This can:

1) Change upper case to lower case and vice versa
2) Change file names by pattern substitution
3) Change file names by number sequence
4) Insert pattern in front of files
5) Insert pattern at the end of files.
6) Simple sort by number.
7) Manual revert of changed files.
8) Deletion of files by pattern match only.
9) Ability to recurse directory
10) Inverse match pattern, like grep -v
11) Case sensitivity search
12) Remove string by character positions. eg 0:10 => remove from start to 10th character.
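
To make item 10 concrete, here is a minimal Python 3 sketch of an inverse wildcard match in the spirit of grep -v. It is my own illustration rather than code from the script below, and the name select is an assumption; it only uses fnmatch.fnmatchcase from the standard library.

```python
import fnmatch

def select(names, pattern, inverse=False):
    """Keep names matching a shell-style wildcard; inverse=True keeps the non-matches."""
    return [n for n in names if fnmatch.fnmatchcase(n, pattern) != inverse]

names = ["run.bat", "notes.txt", "README.TXT"]
print(select(names, "*.bat"))                # files ending in .bat
print(select(names, "*.bat", inverse=True))  # everything else
```

fnmatchcase is used instead of fnmatch so that the match is case sensitive on every platform.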



#!/usr/bin/env python
import sys,os,getopt,glob,string,re,operator,fnmatch,time

################################## FUNCTIONS #################################################
def usage(name):
    print """
%s
Mini File Renamer and Deleter v1.0.2 - Copyright 2008
Author: ghostdog74

%s
usage: %s [-h] [-D dir] [-t f|d ][-[s|p] oldpat] [-e newpat ] [ -i pattern ][ -b pattern ] [-d depth] [-Z|-v|-n|-r|-I] [-l] [files]
-D Directory to start rename/delete. Use quotes for directories with spaces
If no directory is specified, current working directory is assumed
Eg: Use -D c:\\test\\ (for Windows, end with trailing back slash)
-h Prints help page
-t Type of files. Either files(f) or directories(d). Default is files, so can omit -t f
-s Sequence substitution. Specify pattern to be substituted. Must be specified with -e.
-p Pattern substitution. Must be specified with -e (pattern to change to)
eg.
1) To change whole jpeg file name to upper case ==> -p ".*" -e "A-Z" "*.jpeg"
2) To change the word "TEST" to lower case in jpeg file ==> -p "TEST" -e "a-z" "*.jpeg"
3) To remove all numbers from directory name ==> -p "[0-9]+" -e '' -t d -l "*"
4) To remove first character from file/directory ==> -p "^." -e '' -l "*.jpeg"
5) To remove special characters ==> -p "[\"^']" -e "back" "*"

-e * When used with -s, indicates ending sequence pattern. Can include alphanumeric characters. Must use ":" to specify range
Eg
1) -s "test" -e "01:10" will replace 'test' in files from '01' to '10'. If more than 1 files with 'test',
will go by sequence, ie '02' , '03' etc.
2) -s "test" -e "###01:11@@@" will replace 'test' from '###01@@@' , followed by '###02@@@' and so on to ###11@@@
* When used with -p, indicates pattern to change to.
* When used with -c, specify -e "[A-Z]" to change to uppercase, -e "[a-z]" to change to lower case.

-i Insert pattern in front of file name.
-b Insert pattern at the back of file name.
-c Remove characters in file name by position, position index start from 0. Always use -l to verify files to be changed.
eg
1) -c 1 ==> remove 2nd character
2) -c -1 ===> remove last character.
3) -c -2: ===> remove from last 2nd character onwards.
4) -c 3: ==> remove from 4th character onwards
5) -c 4:10 ==> remove from 5th to 10th character
6) -c :3 ==> remove from start to 3rd character.
7) -c 1:3 -e "[A-Z]" ==> change positions specified to uppercase

-l List all files with pattern only. No renaming. Useful for verifying what will be changed.
-r Used alone. Enable restoration of previous commands. eg %s -r
-n Simple Numerical sort. Specify -n to turn numerical sorting on. Only works on files of the same structure.
-d # of directories down to do rename (default = 0). ie: Directory depth level
-I Case insensitive pattern search. eg -p "rot" -e "" -I -l "*.txt". Find rot,ROT,roT,RoT ..
-v Wildcard pattern reversal. eg -v "*.bat" : Files that don't end with .bat.
-Z Do deletion of files.
eg -d 4 -Z ".*01*" -l -v "*.txt" ==> delete all files that don't end with .txt and have the pattern "01" in the
file name, 4 levels deep into current directory
[files] List of files to be renamed/deleted. Can have wildcards. eg test*.txt
To specify all files ==> "*". Will not work if not specified.
""" % ("=" *100 , "=" * 100 ,name,name )


def pathChecker(path):
    ''' Function to determine correct path or file exists
    and then returns the number of count of the path separator
    as an indicator of the depth of the path.
    '''
    if os.path.exists(path):
        pathcount=path.count(os.sep)
        if pathcount > 1:
            return int(pathcount) , path , 0
        else:return 1,1,1
    else:
        return 0,0,-1

def combocheck(s,k):
    '''Function to check the options user keyed in against a set of bad options
    Input : s => bad options
            k => list of user supplied keys...eg -S -s -N..etc
    '''
    v = len(s); t = 0 ##store length of predefined bad options, t=0 to count matched bad ops
    for x in s:
        if x in k:
            t = t + 1
    ## if all bad options found
    if v == t : return True
    else: return False


def doWalk(DIR=None,maxdepth=1,TYPE="f",ACTION="sub",patold=None,patnew=None,DEBUG=1,INVERSE=0,CASE=1,SORT=0,FileNamesArgs=None):
    ''' Traverse the directory specified until depth level,
    looking for files with the correct patterns and rename them
    accordingly
    '''
    GlobbedTypeList=[]
    AllFilesGlobbed=[]

    # convert into reg expression syntax in order to search. eg "*.txt" to ".*txt$"
    regex = fnmatch.translate(FileNamesArgs)
    reobj = re.compile(regex)
    for ROOT,DIRECTORY,FILES in os.walk(DIR,True):
        #do for files less or equal to maxdepth
        if ROOT.count(os.sep) <= int(maxdepth):
            if FileNamesArgs is not None:
                if INVERSE:
                    for FI in os.listdir( ROOT ):
                        if not reobj.search(FI):
                            AllFilesGlobbed.append( os.path.join(ROOT,FI ))
                else:
                    try:
                        allfiles = glob.glob(os.path.join(ROOT,FileNamesArgs))
                    except Exception,e: pass
                    else:
                        if allfiles:
                            for found in allfiles:
                                if found not in AllFilesGlobbed:
                                    AllFilesGlobbed.append(found)
    if TYPE=="d":
        for dirname in AllFilesGlobbed:
            if os.path.isdir(dirname):
                if not dirname in GlobbedTypeList:
                    GlobbedTypeList.append([dirname,dirname.count(os.sep)])
    else:
        for filenames in AllFilesGlobbed:
            if os.path.isfile(filenames):
                if not filenames in GlobbedTypeList:
                    GlobbedTypeList.append(filenames)
    if SORT==1:
        GlobbedTypeList = sorted_copy(GlobbedTypeList)
    else:
        GlobbedTypeList=sorted(GlobbedTypeList, key=(operator.itemgetter(0)))
    # do various actions
    doAction(GlobbedTypeList,TYPE,ACTION,patold,patnew,CASE,SORT,DEBUG)

def brake():
    raw_input("Enter")

def clearscreen(rows):
    for i in range(rows): print

def rename(FROM,TO="",DEBUG=1):
    if FROM == TO:return
    if DEBUG==0 :
        if TO :
            try:
                os.rename(FROM,TO)
            except Exception,e:
                print "Error : ",e
            else:
                print FROM , " is renamed to ", TO
                # store to restore file. simple mechanism. Use pickle/shelve ??
                open(restorefile,"a").write("""%s,%s\n""" %( TO,FROM ))
        elif not TO :
            print "Deleting " ,FROM
            if os.path.isdir(FROM):
                try:
                    os.removedirs(FROM) #or use os.rmdir
                except Exception,e:
                    print "Error: ",e
            elif os.path.isfile(FROM):
                try :
                    os.remove(FROM)
                except Exception,e:
                    print "Error: ",e
    else:
        print "==>>>> ", "[" ,FROM ,"]==>[",TO,"]"

def changecase(patnew,thefile):
    # if -c and -e option and -e [A-Z] or -e [a-z]
    try:
        foundlowercase = re.findall( "\[a-z\]|a-z" , patnew)[0]
    except:
        try:
            founduppercase = re.findall( "\[A-Z\]|A-Z" , patnew)[0]
        except: notfound=1
        else:
            newname = thefile.replace(thefile,thefile.upper())
    else:
        newname = thefile.replace(thefile,thefile.lower())


def doAction(FILES,TYPE="d", ACTION="sub", patold="" ,patnew="", CASE=1, SORT=0,DEBUG=1):
    '''Function to do sequence substitution
    Input: FILES => A list of globbed files to be processed.
           TYPE => f = files, d = directories
           patold => The pattern in the file to be substituted
           patnew => The new pattern to replace the old
           DEBUG => 1 : Do a listing only
                    0 : Do substitution, and renaming of files
    '''

    notfound=0 #flag for doing case change.
    if TYPE=="d":
        # if renaming directories, rename from the last(highest) level. So have to sort according to maxdepth
        FILES=sorted(FILES, key=(operator.itemgetter(1)),reverse=True)

    if CASE == 0:
        patold_re = re.compile(patold,re.IGNORECASE)
    elif CASE:
        patold_re = re.compile(patold)

    if ACTION=="seq":
        # see if in format 001:020 ...this indicates sequence
        patnew_re =re.compile("(\w*\d*\D*)(\d+)[:](\d+)(\D*\w*\d*)")
        seq = patnew_re.findall(patnew)[0]
        startseq=seq[1]; endseq=seq[-2]
        patendseqfront = seq[0]
        patendseqback = seq[-1]
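        # NOTE: the listing is damaged from here down to the sequence increment:
        # when the page was copied, text following each "<" was swallowed as HTML.
        # Lines marked "inferred" are rebuilt from the surviving fragments and from
        # the rest of the script; "..." marks text that could not be recovered, and
        # such lines are left as comments.
        # if endseq and int(endseq) < ... :
        #     ... = endseq,startseq
        # ... endseq=startseq
        # ... endseq == ... : endseq=len(FILES)
        length_startseq = len(str(startseq))

    for n,FN in enumerate(FILES):            # inferred: the loop header is lost, this form is a guess
        if TYPE=="d": FN=FN[0]               # inferred: directory entries are [name, depth] pairs
        elif TYPE=="f": FN=FN                # inferred
        thepath,thefile = os.path.split(FN)  # inferred: "thepath" is used further down
        if ACTION=="insert":                 # inferred from the -i/-b option handling
            if patnew=="front": newname = patold+thefile
            elif patnew=="back": newname = thefile+patold
        elif ACTION=="char":                 # inferred from the -c option handling
            patdigit=re.compile("(-)*(\d*):(-)*(\d*)")
            b= list(thefile)
            if patnew and re.search( "\[a-z\]|a-z" , patnew) :
                caseflag=1
            elif patnew and re.search( "\[A-Z\]|A-Z" , patnew):
                caseflag=2
            else:caseflag=0

            # if single digit
            # if ":" not in patold and ( int(patold) < ... :
            #     ... caseflag==1: ...
            #     ... caseflag==2: ...
            #     newname = ''.join(b)
            # ... (range given as first:second)
            #     foundit = patdigit.findall(patold)[0]
            #     first = ... ; second = ...
            #     ... caseflag==1: newname = thefile[0:int(first)]+thefile[int(first):].lower()
            #     ... caseflag==2: newname = thefile[0:int(first)]+thefile[int(first):].upper()
            #     ...              newname = thefile[ ...
            #     ... caseflag==1: newname = thefile[:int(second)].lower()+thefile[int(second):]
            #     ... caseflag==2: newname = thefile[:int(second)].upper()+thefile[ ...
            #     ...              newname = thefile[ ...
            #     ... caseflag==1: newname = thefile[0:int(first)]+thefile[int(first):int(second)].lower()+thefile[int(second):]
            #     ... caseflag==2: newname = thefile[0:int(first)]+thefile[int(first):int(second)].upper()+thefile[int(second):]
            #     ...              newname = ''.join(b)
        elif ACTION=="seq":                  # inferred: branch header lost
            repl= patendseqfront+str(startseq).zfill(length_startseq)+patendseqback
            newname = re.sub(patold,repl,thefile)
            # if n == ... :
            #     print "... Number of files less than ending sequence number...Exiting.."
            #     break
            #increment sequence
            startseq = int(startseq) + 1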
        elif ACTION=="sub":

            #changing case. changing all filename to upper/lower case.
            try:
                foundlowercase = re.findall( "\[a-z\]|a-z" , patnew)[0]
            except:
                try:
                    founduppercase = re.findall( "\[A-Z\]|A-Z" , patnew)[0]
                except: notfound=1
                else:
                    newname = thefile.replace(thefile,thefile.upper())
            else:
                newname = thefile.replace(thefile,thefile.lower())

            if notfound:
                newname = patold_re.sub(patnew,thefile)
            rename(FN, os.path.join(thepath,newname),DEBUG )
        elif ACTION=="delete":
            rename(FN, None ,DEBUG )

# Taken from Python recipe
def sorted_copy(alist):
    # http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52234
    indices = map(_generate_index, alist)
    decorated = zip(indices, alist)
    decorated.sort()
    return [ item for index, item in decorated ]

def _generate_index(str):
    """
    Splits a string into alpha and numeric elements, which
    is used as an index for sorting
    """
    #
    # the index is built progressively
    # using the _append function
    #
    index = []
    def _append(fragment, alist=index):
        if fragment.isdigit(): fragment = int(fragment)
        alist.append(fragment)

    # initialize loop
    prev_isdigit = str[0].isdigit()
    current_fragment = ''
    # group a string into digit and non-digit parts
    for char in str:
        curr_isdigit = char.isdigit()
        if curr_isdigit == prev_isdigit:
            current_fragment += char
        else:
            _append(current_fragment)
            current_fragment = char
            prev_isdigit = curr_isdigit
    _append(current_fragment)
    return tuple(index)

#--------------------------------------END Functions -------------------------#

## these are options not allowed
bad_options = [ ['-s','-p'],
['-Z','-p'],['-Z','-s'],['-Z','-c'],
['-c','-b'],['-c','-i'],['-c','-p'],['-c','-s'],
#can add somemore...
]

################################## END FUNCTIONS ##################################################

if __name__ == '__main__':

    basename = os.path.basename(sys.argv[0])

    # create restore directory if doesn't exists
    restorepath = ".restore"
    # mode must be octal (0777); a plain 777 is read as a decimal number
    if not os.path.exists(restorepath): os.mkdir(restorepath,0777)
    # design the restoration filename
    TIME=list(time.localtime())
    # get month and day into double digits string
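    # NOTE: the listing is damaged here too (restore file name, default settings,
    # option parsing and the start of the -r restore menu). Text following each "<"
    # was swallowed as HTML when the page was copied. Complete statements are
    # restored, with upper case taken from the rest of the script; "..." marks
    # text that could not be recovered, and such lines are left as comments.
    # if len( str(TIME[1]) ) < ...
    TIME='-'.join(map(str,TIME))
    # restorefile = os.path.join( ...
    DEBUG=1
    INVERSE=0
    TYPE="f"
    ACTION="sub"
    CASE=1
    SORT=0
    # ... args = getopt.gnu_getopt ...
    # ... args = None
    # function =args[0]
    # FileNames=args[-1]
    # ... opts == ...
    # options = dict(opts)
    # keys = options.keys()
    # ... (start of the -r restore menu)
    #     h={};rh={}
    #     ... num=num+1
    #     choice = raw_input( ...
    #     ofilename = h[choice]
    #     ... ofilename == ...
    #     ... To=olines.strip().split( ...
    #     ... n=n+1
    #     ... >>> %s" %( k,rh[k][0],rh[k][1])
    # The rest of the restore menu is intact, but the loop it sits in is lost,
    # so the nesting depth shown here is a guess.
            restoreyesno = raw_input( "Continue to restore [y|n]?: ")
            if restoreyesno in ["n","N"]: break
            elif restoreyesno in ["y","Y"]:
                rchoice = raw_input("Enter choice to restore: ")
                for n, olines in enumerate(open( os.path.join(restorepath,ofilename))):
                    From,To = olines.strip().split(",")
                    n=n+1
                    if "All" in rh[int(rchoice)] or "Original" in rh[int(rchoice)] :
                        try:
                            os.rename(From,To)
                        except Exception,e:
                            print "Error restoring: ",e

                    elif n==int(rchoice):
                        try:
                            os.rename(From,To)
                        except Exception,e:
                            print "Error restoring: ",e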

    # directory key
    if options.has_key('-D') and options['-D'] != []:
        depthcnt ,newpath, ret = pathChecker(options['-D']) #check the root, whether exists
        if ret == -1:
            print "%s does not exists. " % (options['-D'] )
            sys.exit(2)
        options['-D'] = newpath
    else :
        # if not -D specified, take current directory
        depthcnt ,newpath, ret = pathChecker(os.getcwd())
        options['-D'] = os.getcwd()

    DIR=options['-D']

    # check bad options...
    for bad_ops in bad_options :
        if combocheck( bad_ops, keys) :
            usage(basename)
            sys.exit(0)

    # Check for maxdepth
    if options.has_key('-d'):
        maxdepth = int(options['-d']) + depthcnt
    else:
        maxdepth = depthcnt

    # check for list flag - debug mode , 0 for commit.
    if not options.has_key('-l'): DEBUG = 0

    # check for inverse pattern search,similar to grep's -v
    if options.has_key('-v'): INVERSE = 1

    # check numerical sorting
    if options.has_key('-n'): SORT = 1

    #check file type, whether search file or directory
    if options.has_key('-t'):
        TYPE=options['-t']

    #insert in front of file name
    if options.has_key('-i') :
        patold=options['-i']
        patnew="front"
        ACTION="insert"

    #insert at back of file name
    if options.has_key('-b') :
        patold=options['-b']
        patnew="back"
        ACTION="insert"

    if options.has_key('-c') and options.has_key('-e') :
        patold=options['-c']
        patnew=options['-e']
        ACTION="char"
    elif options.has_key('-c') and not options.has_key('-e'):
        patold=options['-c']
        patnew=""
        ACTION="char"

    if options.has_key('-s') and options.has_key('-e') :
        patold=options['-s']
        patnew=options['-e']
        if patnew == "" : usage(basename); sys.exit(1)
        ACTION="seq"
    elif options.has_key('-s') and not options.has_key('-e'):usage(basename) ;sys.exit()

    if options.has_key('-p') and options.has_key('-e') :
        patold=options['-p'];patnew=options['-e']
    elif options.has_key('-p') and not options.has_key('-e'): usage(basename) ;sys.exit()

    if options.has_key('-v'):
        INVERSE=1

    # ignore case sensitivity
    if options.has_key('-I'): CASE=0

    if options.has_key('-Z'):
        patold=options['-Z']
        if not patold : usage(basename);sys.exit(1)
        patnew=""
        ACTION="delete"

    # do the walking.
    try:
        doWalk(DIR,maxdepth,TYPE,ACTION,patold,patnew,DEBUG,INVERSE,CASE,SORT,FileNames)
    except:
        usage(basename)
        sys.exit()
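
The -n switch above sorts names "numerically", so that test2 comes before test10. The script does this with the sorted_copy recipe; a shorter way to get the same effect in Python 3 is sketched here. This is my own version, not ghostdog74's, and natural_key is a name I made up.

```python
import re

def natural_key(name):
    """Split a name into text and number chunks, e.g. 'test10.txt' -> ['test', 10, '.txt']."""
    parts = re.split(r"(\d+)", name)
    # With a capturing group, re.split puts the digit runs at the odd positions,
    # so text is always compared with text and numbers with numbers.
    parts[1::2] = [int(p) for p in parts[1::2]]
    return parts

names = ["test10.txt", "test2.txt", "test1.txt"]
print(sorted(names))                   # plain string sort puts test10 before test2
print(sorted(names, key=natural_key))  # natural sort puts test2 before test10
```

Like the -n switch, this works best on names that share the same structure.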


With examples:

Some examples:
1) Changing cases:
a) Change case of 2nd to 3rd character of all files starting with "test" to upper case.
---> filerenamer.py -c 1:3 -e "[A-Z]" -l "test*"
b) Change case of 2nd till last 4th character of all files to lower case
---> filerenamer.py -c 2:-4 -e "[a-z]" -l "*"
c) Change file name to all upper case
---> filerenamer.py -c 0: -e "[A-Z]" -l "*"

2) Change file names by substitution:
a) Change the word "test" to "foo" for all files starting with "test"
---> filerenamer.py -p "test" -e "foo" -l "test*"

3) Change file names to a number sequence
a) Change the word "test" in files starting with test to number sequence 001 to 100
---> filerenamer.py -s "test" -e "001:100" -l "test*"
b) Change the word "test" in files starting with test to "foo" and number sequence 001 to 100
---> filerenamer.py -s "test" -e "foo001:100" -l "test*"
* changes testbar.txt to foo001bar.txt
c) Change the word "test" in files starting with test to number sequence 001 to 100 followed by "foo"
---> filerenamer.py -s "test" -e "001:100foo" -l "test*"
* changes testbar.txt to 001foobar.txt

4) Removing characters by position
a) Remove the first 5 characters for files starting with "test"
---> filerenamer.py -c 0:5 -l "test*"
b) Remove the last 5 characters for files starting with "test"
---> filerenamer.py -c -5: -l "test*"
c) Remove 2nd character from all files
---> filerenamer.py -c 1 -l "*"

NB: uses the python indexing convention.
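
To spell out what "the python indexing convention" means for -c: a single number is an index (counting from 0, negative numbers counting from the end) and START:END is an ordinary slice. The Python 3 sketch below is mine, not lifted from the script, and remove_positions is a made-up name; it only covers removal, not the case changes.

```python
def remove_positions(name, spec):
    """Remove characters from name; spec is 'N' (one index) or 'START:END' (a slice)."""
    if ":" not in spec:
        index = int(spec)
        if index < 0:
            index += len(name)          # -1 means the last character
        if not 0 <= index < len(name):
            return name                 # out of range: leave the name alone
        return name[:index] + name[index + 1:]
    start_text, end_text = spec.split(":", 1)
    start = int(start_text) if start_text else None   # ':3' has an open start
    end = int(end_text) if end_text else None         # '3:' has an open end
    chars = list(name)
    del chars[start:end]
    return "".join(chars)

print(remove_positions("testbar.txt", "0:5"))   # ar.txt
print(remove_positions("testbar.txt", "-5:"))   # testba
print(remove_positions("testbar.txt", "1"))     # tstbar.txt
```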
and my examples:
