I have a folder with hundreds of CSV files. I want a batch file that loops through the files one by one and deletes any duplicate rows in each file.
Besides the duplicate rows, there are some rows I want to delete based on the value in the second column. For example:
Name,Test Name
City,Mumbai
Country,IN
I want to delete every row whose second column is IN (the Country,IN row above).
Is it possible to do this with a Windows batch script?
The following untested batch script does the job, assuming that:
1) duplicate rows are adjacent to each other;
2) the value to filter on appears in the row as ,IN and no row you want to keep contains the text ,IN (the match is case-sensitive).
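@echo off & setlocal EnableDelayedExpansion
pushd Your_Folder
for %%i in (*.csv) do (
    rem row holds the previously read line of the current file
    set "row="
    rem find /V drops every line that contains ,IN
    for /F "delims=" %%j in ('type "%%i" ^| find /V ",IN"') do (
        rem write the line only if it differs from the previous one
        if not "%%j"=="!row!" >>"%%~ni.tmp" echo.%%j
        set "row=%%j"
    )
)
del *.csv
ren *.tmp *.csv
popd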
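If assumption 2 is too loose for your data (find /V ",IN" also drops rows such as City,INDORE), a stricter filter is possible with findstr. This is an untested sketch, not part of the script above. It assumes that fields contain no quoted commas, that the files use Windows (CRLF) line endings on every line including the last one, and that Your_Folder is the same placeholder as before. It only handles the IN filter, not the duplicates.

```bat
@echo off
setlocal
rem Your_Folder is a placeholder for the folder that holds the CSV files.
pushd "Your_Folder" || exit /b 1
rem dir /b lists the files up front, so the loop is not affected by the moves below.
for /F "delims=" %%i in ('dir /b /a-d *.csv') do (
    rem /V keeps only non-matching lines, /R treats each /C: string as a regular expression.
    rem First pattern: second field is IN and more fields follow.
    rem Second pattern: second field is IN and the row ends there.
    findstr /V /R /C:"^[^,]*,IN," /C:"^[^,]*,IN$" "%%i" > "%%~ni.tmp"
    move /Y "%%~ni.tmp" "%%i" > nul
)
popd
```

With the sample rows from the question, Country,IN would be removed, while a row like City,INDORE would be kept, because IN has to be the whole second field.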
:: ===== script starts here ===============
:: get out dups & IN in column 2
:: rubal.bat 2013-02-27 17:27:30.37
@echo off > newfile & setLocal enableDELAYedeXpansioN

for /f "tokens=* delims= " %%x in ('dir/b *.csv') do (
    type nul > newfile
    for /f "tokens=* delims= " %%o in (%%x) do (
        find "%%o" < newfile > nul || >> newfile echo.%%o
    )
    copy newfile %%x > nul
)
del newfile

for /f "tokens=* delims= " %%c in ('dir/b *.csv') do (
    if exist # del #
    for /f "tokens=* delims= " %%a in (%%c) do (
        set S=%%a & set Z=!S:,= !
        call :sub1 !Z!
        if defined S >> # echo.!S!
    )
    copy # %%c
)
del #
goto :eof

:sub1
if '%2' equ 'IN' set S=
goto :eof
::====== script ends here =================
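One thing to watch in the script above: the duplicate check uses find, which matches substrings, so a row that happens to be contained in an earlier, longer row is also treated as a duplicate. A possible variation compares whole lines with findstr /X instead. This is an untested sketch. It assumes rows contain no double quotes, backslashes, or characters that are special to cmd (& | < > ^ %), that rows are short enough for findstr to accept as a search string, and that blank lines and lines starting with ; may be dropped, since for /F skips them.

```bat
@echo off
setlocal
rem Run from the folder that holds the CSV files.
for /F "delims=" %%f in ('dir /b /a-d *.csv') do (
    rem Start with an empty work file for this CSV.
    type nul > "%%~nf.tmp"
    for /F "usebackq delims=" %%r in ("%%f") do (
        rem /X = the whole line must match, /L = treat the text literally.
        rem The row is appended only when findstr does not find it in the work file.
        findstr /X /L /C:"%%r" "%%~nf.tmp" > nul || >> "%%~nf.tmp" echo.%%r
    )
    move /Y "%%~nf.tmp" "%%f" > nul
)
```

Like the original, this rescans the work file for every row, so it will be slow on large files. The comparison is case-sensitive, so rows that differ only in letter case are both kept.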