coldfusion coldfusion-7

Match a string in all the files in a directory and return its total count


I need a way to count the total number of occurrences of a particular string across all the files in a directory, for example the total count of 'ABC' in all the files. I already have code that does this for a single file at a time:

    <cffile action="read"
        file="full_Path\file.txt"
        variable="filecontent">
    <cfset charList = "strings to match/search">
    <cfoutput> 
    <cfloop list="#charList#" index="x"> 
        <cfset charCount = val(len(filecontent) - len(replace(filecontent,x,"","all")))/Len(x)> 
        Count of '#htmlEditFormat(x)#' = #charCount#<br> 
    </cfloop>
    </cfoutput>

I now have some new requirements. I need the result in tabular format so that I can export it to an Excel sheet. I tried this:

    <cfquery name="getname" dbtype="query">
    Select Name,Size from Files
    </cfquery>
    <cfset myArray = ArrayNew(1)>
    <cfset myArray1 = ArrayNew(1)>
    <cfset myArray2 = ArrayNew(1)>
    <cfset charList = "list of strings">
    <cfloop list="#charList#" index="x">
        <cfset stringCounts[x] = 0>
    </cfloop>

    <cfoutput query="Files">
        <cffile action="read"
            file="#Files.directory#\#Files.name#"
            variable="filecontent">

        <cfloop list="#charList#" index="x">

            <cfset stringCounts[x] = stringCounts[x] + val(len(filecontent) - len(replace(filecontent,x,"","all")))/Len(x)>

            <cfset ArrayAppend(myArray1, #Files.name#)>
            <cfset ArrayAppend(myArray2, #x#)>
            <cfset ArrayAppend(myArray, #stringCounts[x]#)>
        </cfloop>
    </cfoutput>

    <cfset Qryalldata = Querynew("")>
    <cfset row1 = QueryAddcolumn(Qryalldata,"FileName", myArray1)>
    <cfset row2 = QueryAddcolumn(Qryalldata,"Counta", myArray)>
    <cfset row3 = QueryAddcolumn(Qryalldata,"Tags", myArray2)>

    <cfquery name="Result" dbtype="Query">
       Select FileName,Tags,Counta from Qryalldata
    </cfquery>
    <cfdump var="#Result#">

The result looks like this:

    FileName    Tags       Counta
    File1       CFquery    2
    File1       CFIf       1
    File1       CFElse     1
    File2       CFquery    3
    ...

How can I format the output like this instead?

Name of File    Size    count of CFQuery    count of CFIF     count of CFElse  etc

Solution

  • OK, so what you want to do is loop over all the files returned by a cfdirectory call. You may want to add some logic to process only particular file types (the filter attribute of cfdirectory can also cover this).

    And as you're counting occurrences of a list of possible strings, you need a separate counter for each one. There are various ways you could store that information in variables; I'm going to suggest a struct. So if you're looking for the counts of the strings "foo" and "bar", you will end up with a structure which looks like:

    {
        'foo' = 100,
        'bar' = 77
    }
    

    And here's how I'd do it. I first populate the structure with zeroes for each string you're searching for, then increment the counts while looping over the files. I'm assuming your code that counts the instances of the search terms is good; I haven't looked at it too closely.

    <cfset charList = "foo,bar">
    <cfset filetypes = arrayNew(1)>
    <cfset arrayAppend(filetypes, "js")>
    <cfset arrayAppend(filetypes, "cfm")>
    
    <cfset stringCounts = structNew()>
    
    <cfloop list="#charList#" index="x"> 
        <cfset stringCounts[x] = 0>
    </cfloop>
    
    <cfloop index="i" from="1" to="#arrayLen(filetypes)#">
        <cfdirectory
            action="list"
            directory="your directory" 
            name="Files"
            recurse = "yes"
            filter="*.#filetypes[i]#" />         
    
        <cfloop query="Files">
            <cffile action="read"
                file="#Files.directory#\#Files.name#"
                variable="filecontent">
    
            <cfloop list="#charList#" index="x">
                <!--- occurrences of x in this file only --->
                <cfset fileCount = (len(filecontent) - len(replace(filecontent,x,"","all"))) / len(x)>
                <!--- add them to the running total across all files --->
                <cfset stringCounts[x] = stringCounts[x] + fileCount>
                <cfoutput>#Files.directory#\#Files.name# : count of '#htmlEditFormat(x)#' = #fileCount#<br></cfoutput>
            </cfloop>
        </cfloop>
    </cfloop>
    
    <cfloop collection="#stringCounts#" item="x"> 
        <cfoutput>Count of '#htmlEditFormat(x)#' = #stringCounts[x]#<br></cfoutput>
    </cfloop>
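
    For the tabular layout asked about in the question (one row per file, one column per search term), the same loop could fill a query object instead of a struct. This is only a rough sketch, not tested on ColdFusion 7: searchDir, resultQuery, columnList and fileCount are names made up for the example, the "count_" column prefix is an arbitrary choice, and it assumes every search term is a valid column name (letters, digits and underscores only).

    ```cfml
    <!--- Assumption: searchDir is a placeholder, replace it with a real path --->
    <cfset searchDir = "your directory">
    <cfset charList = "cfquery,cfif,cfelse">

    <!--- Build the column list: file name, file size, then one column per term --->
    <cfset columnList = "FileName,FileSize">
    <cfloop list="#charList#" index="x">
        <cfset columnList = listAppend(columnList, "count_" & x)>
    </cfloop>
    <cfset resultQuery = queryNew(columnList)>

    <cfdirectory
        action="list"
        directory="#searchDir#"
        name="Files"
        recurse="yes"
        filter="*.cfm">

    <cfloop query="Files">
        <!--- Only read real files, not any directories in the listing --->
        <cfif Files.type eq "File">
            <cffile action="read"
                file="#Files.directory#\#Files.name#"
                variable="filecontent">

            <!--- One row per file; querySetCell writes to the last row by default --->
            <cfset queryAddRow(resultQuery)>
            <cfset querySetCell(resultQuery, "FileName", Files.name)>
            <cfset querySetCell(resultQuery, "FileSize", Files.size)>

            <cfloop list="#charList#" index="x">
                <!--- Occurrences of x in this file only --->
                <cfset fileCount = (len(filecontent) - len(replace(filecontent, x, "", "all"))) / len(x)>
                <cfset querySetCell(resultQuery, "count_" & x, fileCount)>
            </cfloop>
        </cfif>
    </cfloop>

    <cfdump var="#resultQuery#">
    ```

    Each row then holds the per-file counts rather than running totals, so the query can be looped over to write an HTML table or a delimited file for Excel. Note that replace() is case-sensitive; if "CFQuery" and "cfquery" should be counted together, replaceNoCase() takes the same arguments.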