I parsed a website with Jsoup and extracted the links. Now I tried to store just a part of that link in an ArrayList. Somehow I cannot store one link at a time.
I tried several String methods, Scanner and BufferedReader without success.
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
public class DatenImportUnternehmen {
public static void main(String[] args) throws IOException {
ArrayList<String> aktien = new ArrayList<String>();
String searchUrl = "https://www.ariva.de/aktiensuche/_result_table.m";
for(int i = 0; i < 1; i++) {
String searchBody = "page=" + Integer.toString(i) +
"&page_size=25&sort=ariva_name&sort_d=asc
&ariva_performance_1_year=_&ariva_per
formance_3_years=&ariva_performance_5_years=
&index=0&founding_year=&land=0&ind
ustrial_sector=0§or=0¤cy=0
&type_of_share=0&year=_all_years&sales=_&p
rofit_loss=&sum_assets=&sum_liabilities=
&number_of_shares=&earnings_per_share=
÷nd_per_share=&turnover_per_share=
&book_value_per_share=&cashflow_per_sh
are=&balance_sheet_total_per_share=
&number_of_employees=&turnover_per_employee
=_&profit_per_employee=&kgv=_&kuv=_&kbv=_÷nd
_yield=_&return_on_sales=_";
// post request to search URL
Document document =
Jsoup.connect(searchUrl).requestBody(searchBody).post();
// find links in returned HTML
for(Element link:document.select("a[href]")) {
String link1 = link.toString();
String link2 = link1.substring(link1.indexOf('/'));
String link3 = link2.substring(0, link2.indexOf('"'));
aktien.add(link3);
System.out.println(aktien);
}
}
}
}
My output looks like (just a part of it):
[/1-1_drillisch-aktie]
[/1-1_drillisch-aktie, /11_88_0_solutions-aktie]
[/1-1_drillisch-aktie, /11_88_0_solutions-aktie, /1st_red-aktie]
[/1-1_drillisch-aktie, /11_88_0_solutions-aktie, /1st_red-aktie, /21st-
_cent-_fox_b_new-aktie]
[/1-1_drillisch-aktie, /11_88_0_solutions-aktie, /1st_red-aktie, /21st-
_cent-_fox_b_new-aktie, /21st_century_fox-aktie]
[/1-1_drillisch-aktie, /11_88_0_solutions-aktie, /1st_red-aktie, /21st-
_cent-_fox_b_new-aktie, /21st_century_fox-aktie, /2g_energy-aktie]
[/1-1_drillisch-aktie, /11_88_0_solutions-aktie, /1st_red-aktie, /21st-
_cent-_fox_b_new-aktie, /21st_century_fox-aktie, /2g_energy-aktie,
/3i_group-aktie]
[/1-1_drillisch-aktie, /11_88_0_solutions-aktie, /1st_red-aktie, /21st-
_cent-_fox_b_new-aktie, /21st_century_fox-aktie, /2g_energy-aktie,
/3i_group-aktie, /3i_infrastructure-aktie]
What I want to achieve is:
[/1-1_drillisch-aktie]
[/11_88_0_solutions-aktie]
[/1st_red-aktie]
[/21st-_cent-_fox_b_new-aktie]
and so on.
I just don't now what the problem is at this stage.
Your problem is that you are printing the array whilst adding to it in the loop.
To resolve the issue you can print the array outside of the array to print everything in one go, or you can print link3
(which is what you are adding to the ArrayList), instead of the array in the loop.
Option 1:
for(Element link:document.select("a[href]")) {
String link1 = link.toString();
String link2 = link1.substring(link1.indexOf('/'));
String link3 = link2.substring(0, link2.indexOf('"'));
aktien.add(link3);
}
System.out.println(aktien);
Option 2:
for(Element link:document.select("a[href]")) {
String link1 = link.toString();
String link2 = link1.substring(link1.indexOf('/'));
String link3 = link2.substring(0, link2.indexOf('"'));
aktien.add(link3);
System.out.println(link3);
}