
How can I store Jsoup output in an ArrayList?


I parsed a website with Jsoup and extracted the links. Now I am trying to store just a part of each link in an ArrayList, but the output does not show one link at a time.

I tried several String methods, Scanner and BufferedReader without success.

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class DatenImportUnternehmen {

    public static void main(String[] args) throws IOException {

        ArrayList<String> aktien = new ArrayList<String>();
        String searchUrl = "https://www.ariva.de/aktiensuche/_result_table.m";

        for (int i = 0; i < 1; i++) {

            String searchBody = "page=" + Integer.toString(i)
                    + "&page_size=25&sort=ariva_name&sort_d=asc"
                    + "&ariva_performance_1_year=_&ariva_performance_3_years="
                    + "&ariva_performance_5_years="
                    + "&index=0&founding_year=&land=0&industrial_sector=0"
                    + "&sector=0&currency=0"
                    + "&type_of_share=0&year=_all_years&sales=_&profit_loss="
                    + "&sum_assets=&sum_liabilities="
                    + "&number_of_shares=&earnings_per_share="
                    + "&dividend_per_share=&turnover_per_share="
                    + "&book_value_per_share=&cashflow_per_share="
                    + "&balance_sheet_total_per_share="
                    + "&number_of_employees=&turnover_per_employee=_"
                    + "&profit_per_employee=&kgv=_&kuv=_&kbv=_"
                    + "&dividend_yield=_&return_on_sales=_";

            // post request to search URL
            Document document = Jsoup.connect(searchUrl).requestBody(searchBody).post();
            // find links in returned HTML
            for (Element link : document.select("a[href]")) {
                String link1 = link.toString();
                String link2 = link1.substring(link1.indexOf('/'));
                String link3 = link2.substring(0, link2.indexOf('"'));

                aktien.add(link3);

                System.out.println(aktien);
            }
        }
    }
}

My output looks like this (only part of it):

[/1-1_drillisch-aktie]
[/1-1_drillisch-aktie, /11_88_0_solutions-aktie]
[/1-1_drillisch-aktie, /11_88_0_solutions-aktie, /1st_red-aktie]
[/1-1_drillisch-aktie, /11_88_0_solutions-aktie, /1st_red-aktie, /21st-_cent-_fox_b_new-aktie]
[/1-1_drillisch-aktie, /11_88_0_solutions-aktie, /1st_red-aktie, /21st-_cent-_fox_b_new-aktie, /21st_century_fox-aktie]
[/1-1_drillisch-aktie, /11_88_0_solutions-aktie, /1st_red-aktie, /21st-_cent-_fox_b_new-aktie, /21st_century_fox-aktie, /2g_energy-aktie]
[/1-1_drillisch-aktie, /11_88_0_solutions-aktie, /1st_red-aktie, /21st-_cent-_fox_b_new-aktie, /21st_century_fox-aktie, /2g_energy-aktie, /3i_group-aktie]
[/1-1_drillisch-aktie, /11_88_0_solutions-aktie, /1st_red-aktie, /21st-_cent-_fox_b_new-aktie, /21st_century_fox-aktie, /2g_energy-aktie, /3i_group-aktie, /3i_infrastructure-aktie]

What I want to achieve is:

[/1-1_drillisch-aktie]
[/11_88_0_solutions-aktie]
[/1st_red-aktie]
[/21st-_cent-_fox_b_new-aktie]

and so on.

I just don't know what the problem is at this stage.


Solution

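  • Your problem is that you print the whole ArrayList inside the loop, right after each add. The list itself is fine: it holds one link per entry, but every iteration prints everything collected so far.

    To resolve the issue, you can print the list once after the loop to show everything in one go, or you can print link3 (which is what you are adding to the ArrayList) inside the loop instead of the list. Note that Option 2 prints each entry without the surrounding brackets.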

    Option 1:

    for(Element link:document.select("a[href]")) {
        String link1 = link.toString();
        String link2 = link1.substring(link1.indexOf('/'));
        String link3 = link2.substring(0, link2.indexOf('"'));
    
        aktien.add(link3);
    }
    System.out.println(aktien);
    

    Option 2:

    for(Element link:document.select("a[href]")) {
        String link1 = link.toString();
        String link2 = link1.substring(link1.indexOf('/'));
        String link3 = link2.substring(0, link2.indexOf('"'));
    
        aktien.add(link3);
        System.out.println(link3);
    }
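
If the bracketed one-entry-per-line format from the question is wanted, a third variant is to fill the list first and then print each entry on its own line. The following is only a minimal sketch that runs without a network connection: the class name PrintPerEntryDemo, the helpers extractPath and collectPaths, and the sample anchor strings are all made up for illustration and stand in for the Jsoup result. The substring logic is the same as in the code above, so it shares its limitation of cutting at the first '/' in the tag; reading the attribute with Jsoup's link.attr("href") would likely be more robust.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PrintPerEntryDemo {

    // Hypothetical helper: cuts the path out of an anchor tag's HTML,
    // using the same substring approach as the code above.
    static String extractPath(String anchorHtml) {
        String fromSlash = anchorHtml.substring(anchorHtml.indexOf('/'));
        return fromSlash.substring(0, fromSlash.indexOf('"'));
    }

    // Hypothetical helper: collects one path per anchor.
    // Nothing is printed while the list is being filled.
    static List<String> collectPaths(List<String> anchors) {
        List<String> paths = new ArrayList<>();
        for (String anchor : anchors) {
            paths.add(extractPath(anchor));
        }
        return paths;
    }

    public static void main(String[] args) {
        // Made-up sample markup standing in for document.select("a[href]").
        List<String> anchors = Arrays.asList(
                "<a href=\"/1-1_drillisch-aktie\">1&1 Drillisch</a>",
                "<a href=\"/11_88_0_solutions-aktie\">11 88 0 Solutions</a>");

        // Print every entry on its own line, wrapped in brackets.
        for (String path : collectPaths(anchors)) {
            System.out.println("[" + path + "]");
        }
    }
}
```

With the two sample anchors this prints [/1-1_drillisch-aktie] and [/11_88_0_solutions-aktie] on separate lines, while the list still holds all entries for later use.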