I'm doing a problem that clearly requires usages of Set
but I need to retrieve elements from a set in insertion order + retrieve specific elements.
But finding the specific element was too slow( I guess O(n)
since I have to go through the entire set to find & return it ).
So I went with a LinkedHashMap<someClass,someClass>
where the key-value mappings contained the same objects.
Although this was faster, it used up twice as much memory which is especially concerning if my key/value(both same anyways) happened to take up alot of space.
I'm hoping if anyone has a better solution to my problem or optimizations.
Edit: By the way the comments for this SO answer may help
Edit:
public Set<CompanyDummy> run(int numberOfCompanies)
{
Set<CompanyDummy> companies=new LinkedHashSet<>();
//Create companies
CompanyDummy company;
for(int idx=0;idx<=numberOfCompanies-1;idx++)
{
company=new CompanyDummy(idx);
companies.add(company);
}
//Some code here to fill up each CompanyDummy element with engineers
//At this point,there will be input
//specifying which companies to merge via their index(not reference)
//Problem guarantees those companies exist. Hence, my reason why
//I didn't do something like
//if (set.contains(value)) return value;
//Do note when we merge companies u & v, v is closed down
for(int idx=0;idx<=transactions-1;idx++)
{
companyID= scanner.nextInt();
anotherCompanyID= scanner.nextInt();
//This part is where I search through companies to find if one exists
//which is O(n)
//Replacing sets with maps somehow makes the program faster
//despite LinkedHashSet being backed by LinkedHashMap
company=findCompany(companies, companyID);
anotherCompany=findCompany(companies, anotherCompanyID);
if(company!=null && anotherCompany!=null)
{
company.union(anotherCompany);
companies.remove(anotherCompany);
}
}
}
private CompanyDummy findCompany(Set<CompanyDummy> companies,int id)
{
for(CompanyDummy company : companies)
{
if(company.getIndex()==id)
{
return company;
}
}
return null;
}
}
class CompanyDummy
{
private int index;
private Set<Integer> engineers;//The Integer here just denotes the engineer
public CompanyDummy(int index)
{
this.index=index;
}
public int getindex()
{
return index;
}
public void union(CompanyDummy someCompany)
{
this.engineers.addAll(someCompany.engineers);
}
}
OK, so now that I see the code, I can understand the problem well enough to give you a solution.
You are not simply retrieving a specific element from the set. You are actually retrieving an element with a specific value in a specific field if it is a member of the set. That's why you can't use the set's (or map's) fast lookup operations1.
Solution:
If you want to do the retrieval in better than O(N)
, you need a Map<Integer, CompanyDummy>
. The map must be populated so that it maps from an id
to the CompanyDummy
with that id
as its index
. Then you can replace the findCompany
calls with Map.get
calls.
Note that replacing a Set<CompanyDummy>
with a Map<CompanyDummy, CompanyDummy>
will not help for this problem. Both will give you O(N)
performance.
You say this about your experiment with replacing a LinkedHashSet<CompanyDummy>
with a LinkedHashMap<CompanyDummy, CompanyDummy>
:
Although this was faster, it used up twice as much memory which is especially concerning if my key/value(both same anyways) happened to take up a lot of space.
I doubt that either of those things were actually true.
Space utilization of the two data structures should be identical, given that that a LinkedHashSet
is a actually LinkedHashMap
under the covers.
It is (IMO) unlikely that there was a significant real difference in performance. Certainly not a 2-fold difference. The most likely explanation is that your methodology for measuring and comparing performance does take proper account of possible timing distortions caused by (various) JVM startup overheads; e.g. class loading, heap resizing and garbage collection and JIT compilation.
1 - Just for completeness, if it wasn't a requirement that your set / map is ordered in insertion order, you could use a TreeSet<CompanyDummy>
with a custom Comparator
that orders CompanyDummy
objects by their index
values. You could then use OrderedSet.ceiling
with a dummy CompanyDummy
instances to probe the set in O(logN)
.