Pharlap
08-04-2007, 05:23 PM
Hi Folks of the world VBA community
So from the begining .... here is the CHALLENGE should you choose to help me.
I have searched the many forum solutions for duplicates however the ones I found relate to a single column. However what I would like to be able to do is starting from row 8 is to remove duplicates of a mulitple of columns of a single worksheet exmple attached - namely columns P, R, S, T, and U. Note some cells in each column are blank as per the example attached to this thread, also there may be more than one duplicate in a column and not always right underneath the original -could be anywhere in the column. Also it is important to retain/keep values that have not been duplicated.
To clarify - it would firstly check for duplicates in column P, keeping just one of the rows of the duplicated value and removing entiely the row/rows of where the duplications appeared. And also retaining rows where values in the column have not be duplicated. Then go through each of the other four columns. of the worksheet doing the same for each of the other four columns.
Then after the duplicates are removed from all five columns provide a tally at the bottom of the worksheet of how many times a value now appears in the cells of each of repective columns.
Perhaps it would be good to produce the tally by worker of a total of how many times a values appears in a column that that still remains after rows are removed perhaps output the tally to new worksheet in the same format and retain the original worksheet. see below for more detail.
So far someone has helped me with the first part being the removing the duplicates part but I would like to add to it the other things that I have since thought I would like to also do which are also listed below.
This code works great for removing the duplicates
Sub test()
Dim a, i As Long, txt As String, z As String
Dim x As Integer
Dim ws As Worksheet
Application. ScreenUpdating = False
Set ws = Worksheets("SHEET1")
With CreateObject("Scripting.Dictionary")
.CompareMode = vbTextCompare
For x = 16 To 21
Again:
a = ws.Range("A65536").End(xlUp).Row
For i = 5 To a
z = ws.Cells(i, x).Value
If Not z = "" Then
If Not .exists(z) Then
.Add z, Nothing
Else
txt = txt & "," & Cells(i, 1).Address(0, 0)
If Len(txt) > 245 Then
Range(Mid$(txt, 2)).EntireRow.Delete
txt = "": .RemoveAll: Goto Again
End If
End If
End If
Next
If Len(txt) Then Range(Mid$(txt, 2)).EntireRow.Delete
txt = ""
Next x
End With
End Sub
The other things I would like to be able to acheive is
1. for the code to run on any tab name - that is to say the code auto picks up the tab name it is run on rather than me having to rekey different tab names into the code for different worksheet names. NOTE there is only ever one worksheet in the workbook but each time I extract the data from the system the tab name is different so in the code to remove dups I hve so far I ahve to manually change the worksheet name...I would prefer that the code auto detect that if that is possible.
2. Also I have thought it would be good if the code above could also delete some other rows but these arent duplicates - so perhaps a second piece of code to also delete those other rows where ;
-in column G if in any part of the name string contains word "CLASS" (note the sting can contain a upt to ten words and be up to 70 characters in totol lenght ) and there is "X" in same row in any columns of I, J, K and/or IL - then delete the row so long as NO value is present in columns P, R, S, T, and U on the same row - as I want to keep those rows. In attached the test file but I have added to it some test data for this part of it - there are 6 new rows at record numbers 5 to 11 - of these 6 new rows file are highlighted in blue and one in orange - if the code works well - as also doing what the first piece of code did to remove all duplicates from columns P, R, S, T and U the new code would now also be able delete the rows indicated by blue and leave the orange row - of course in the real files the rows would not be coloured.
3. that AFTER ALL require deletions of rows have been completed the final thing is to provide a tally for each worker in column E - of how many rows they have with values appearing in columns I, J, K, L and P, R, S, T, U - now I do have some code as below to this which some one else helped me previosly with for another workbook -which I think with some modification might be able be intigrated into the code required for this overall exercise however I havent been able to achieve it-what do you think ?
Sub test()
Dim a As Variant
Dim b() As Variant
Dim NewWs As Worksheet
Dim j As Integer, k As Integer, i As Integer, ii As Integer
a = Range(Range("IV" & 7).End(xlToLeft), Range("B" & Rows.Count).End(xlUp)).Value
Redim b(1 To UBound(a, 1), 1 To UBound(a, 2))
With CreateObject("Scripting.Dictionary")
.CompareMode = vbTextCompare
For i = 1 To UBound(a, 1): k = 1 'Reset Column Counter
If Not IsEmpty(a(i, 1)) And Not .exists(a(i, 1)) Then
j = j + 1: .Add a(i, 1), j
b(j, 1) = a(i, 1) ' Add Header
End If
For ii = 5 To UBound(a, 2)
Select Case ii
Case 5, 6, 7, 8, 12, 14, 15, 16, 17
k = k + 1
If i = 1 Then b(1, k) = a(i, ii)
If Not a(i, ii) = "" And Not i = 1 Then b(.Item(a(i, 1)), k) = b(.Item(a(i, 1)), k) + 1
End Select
Next ii
Next i
End With
Set NewWs = Worksheets.Add
NewWs.Name = "WORKER TALLY"
NewWs.Range("A1").Resize(j, k).Value = (b)
With NewWs.Range("A1", NewWs.Range("IV" & 1).End(xlToLeft))
.Font.ColorIndex = 6
.Interior.ColorIndex = 11
.Font.Name = "Arial"
.Font.Bold = True
.WrapText = True
.Rows.AutoFit
End With
NewWs.Columns(1).AutoFit
End Sub
I am only a beginner with VBA so your help would be greatly appreciated, plus the result code of course can be shared on this forum.
Regards
Pharlap :-)
So from the begining .... here is the CHALLENGE should you choose to help me.
I have searched the many forum solutions for duplicates however the ones I found relate to a single column. However what I would like to be able to do is starting from row 8 is to remove duplicates of a mulitple of columns of a single worksheet exmple attached - namely columns P, R, S, T, and U. Note some cells in each column are blank as per the example attached to this thread, also there may be more than one duplicate in a column and not always right underneath the original -could be anywhere in the column. Also it is important to retain/keep values that have not been duplicated.
To clarify - it would firstly check for duplicates in column P, keeping just one of the rows of the duplicated value and removing entiely the row/rows of where the duplications appeared. And also retaining rows where values in the column have not be duplicated. Then go through each of the other four columns. of the worksheet doing the same for each of the other four columns.
Then after the duplicates are removed from all five columns provide a tally at the bottom of the worksheet of how many times a value now appears in the cells of each of repective columns.
Perhaps it would be good to produce the tally by worker of a total of how many times a values appears in a column that that still remains after rows are removed perhaps output the tally to new worksheet in the same format and retain the original worksheet. see below for more detail.
So far someone has helped me with the first part being the removing the duplicates part but I would like to add to it the other things that I have since thought I would like to also do which are also listed below.
This code works great for removing the duplicates
Sub test()
Dim a, i As Long, txt As String, z As String
Dim x As Integer
Dim ws As Worksheet
Application. ScreenUpdating = False
Set ws = Worksheets("SHEET1")
With CreateObject("Scripting.Dictionary")
.CompareMode = vbTextCompare
For x = 16 To 21
Again:
a = ws.Range("A65536").End(xlUp).Row
For i = 5 To a
z = ws.Cells(i, x).Value
If Not z = "" Then
If Not .exists(z) Then
.Add z, Nothing
Else
txt = txt & "," & Cells(i, 1).Address(0, 0)
If Len(txt) > 245 Then
Range(Mid$(txt, 2)).EntireRow.Delete
txt = "": .RemoveAll: Goto Again
End If
End If
End If
Next
If Len(txt) Then Range(Mid$(txt, 2)).EntireRow.Delete
txt = ""
Next x
End With
End Sub
The other things I would like to be able to acheive is
1. for the code to run on any tab name - that is to say the code auto picks up the tab name it is run on rather than me having to rekey different tab names into the code for different worksheet names. NOTE there is only ever one worksheet in the workbook but each time I extract the data from the system the tab name is different so in the code to remove dups I hve so far I ahve to manually change the worksheet name...I would prefer that the code auto detect that if that is possible.
2. Also I have thought it would be good if the code above could also delete some other rows but these arent duplicates - so perhaps a second piece of code to also delete those other rows where ;
-in column G if in any part of the name string contains word "CLASS" (note the sting can contain a upt to ten words and be up to 70 characters in totol lenght ) and there is "X" in same row in any columns of I, J, K and/or IL - then delete the row so long as NO value is present in columns P, R, S, T, and U on the same row - as I want to keep those rows. In attached the test file but I have added to it some test data for this part of it - there are 6 new rows at record numbers 5 to 11 - of these 6 new rows file are highlighted in blue and one in orange - if the code works well - as also doing what the first piece of code did to remove all duplicates from columns P, R, S, T and U the new code would now also be able delete the rows indicated by blue and leave the orange row - of course in the real files the rows would not be coloured.
3. that AFTER ALL require deletions of rows have been completed the final thing is to provide a tally for each worker in column E - of how many rows they have with values appearing in columns I, J, K, L and P, R, S, T, U - now I do have some code as below to this which some one else helped me previosly with for another workbook -which I think with some modification might be able be intigrated into the code required for this overall exercise however I havent been able to achieve it-what do you think ?
Sub test()
Dim a As Variant
Dim b() As Variant
Dim NewWs As Worksheet
Dim j As Integer, k As Integer, i As Integer, ii As Integer
a = Range(Range("IV" & 7).End(xlToLeft), Range("B" & Rows.Count).End(xlUp)).Value
Redim b(1 To UBound(a, 1), 1 To UBound(a, 2))
With CreateObject("Scripting.Dictionary")
.CompareMode = vbTextCompare
For i = 1 To UBound(a, 1): k = 1 'Reset Column Counter
If Not IsEmpty(a(i, 1)) And Not .exists(a(i, 1)) Then
j = j + 1: .Add a(i, 1), j
b(j, 1) = a(i, 1) ' Add Header
End If
For ii = 5 To UBound(a, 2)
Select Case ii
Case 5, 6, 7, 8, 12, 14, 15, 16, 17
k = k + 1
If i = 1 Then b(1, k) = a(i, ii)
If Not a(i, ii) = "" And Not i = 1 Then b(.Item(a(i, 1)), k) = b(.Item(a(i, 1)), k) + 1
End Select
Next ii
Next i
End With
Set NewWs = Worksheets.Add
NewWs.Name = "WORKER TALLY"
NewWs.Range("A1").Resize(j, k).Value = (b)
With NewWs.Range("A1", NewWs.Range("IV" & 1).End(xlToLeft))
.Font.ColorIndex = 6
.Interior.ColorIndex = 11
.Font.Name = "Arial"
.Font.Bold = True
.WrapText = True
.Rows.AutoFit
End With
NewWs.Columns(1).AutoFit
End Sub
I am only a beginner with VBA so your help would be greatly appreciated, plus the result code of course can be shared on this forum.
Regards
Pharlap :-)