javascript c++ v8

How do I extract the URL element from a V8 script object?


I've modified V8 slightly to print the URL that a script originates from before it is parsed.

Background: I'm also printing the JS itself so that I can deobfuscate potentially malicious JS step by step. jmrk kindly showed me how to do that.

What works: Chromium prints an object containing the JS URL to stdout: Name: 0x2357009098e1 <String[55]: e"chrome://... module_proxy.js">

What doesn't work: How do I reference/extract the URL string in the object ("chrome://...")? It's probably simple, but my C++ skills are limited.

Code excerpt (v8/src/parsing/parsing.cc)

@@ -16,6 +16,14 @@
  #include "src/parsing/scanner-character-streams.h"
  #include "src/zone/zone-list-inl.h"  // crbug.com/v8/8816
 
+ #include <string>
+ #include <iostream>
+ #include <stdio.h>
+ #include "src/objects/script.h"
+ #include "src/objects/call-site-info.h"
+
  namespace v8 {
  namespace internal {
  namespace parsing {
@@ -50,6 +58,11 @@ bool ParseProgram(ParseInfo* info, Handle<Script> script,
        ScannerStream::For(isolate, source));
    info->set_character_stream(std::move(stream));
 
+  if (script->HasValidSource()) {
+    Handle<Object> source_url(script->GetNameOrSourceURL(), isolate);
+    std::cout << source_url << std::endl;
+  }
+
   Parser parser(isolate->main_thread_local_isolate(), info, script);

Thank you.


Solution

  • std::cout << source_url calls ShortPrint(source_url, std::cout). Depending on V8 build settings, that likely calls std::cout << Brief(source_url), which in turn ends up calling heap_object->HeapObjectShortPrint(std::cout). That is why you get the debug representation of the string object rather than the bare URL.

    You can look at the implementation of void HeapObject::HeapObjectShortPrint(std::ostream& os) and reuse the parts you need from void String::StringShortPrint(StringStream* accumulator).
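
    To make the dispatch described above concrete, here is a minimal, self-contained sketch. It does not use V8 at all: FakeString, Contents() and BriefToString() are hypothetical stand-ins, and the URL is a made-up placeholder. It only illustrates why streaming the object goes through a debug printer, and how reading the characters directly avoids the wrapper text. In a real V8 checkout you would first check that the object is a string and then read its characters with an internal String helper. Many V8 versions have String::ToCString() for this, but treat that as an assumption and verify it against src/objects/string.h in your tree.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical stand-in for a V8 heap string. This is NOT V8's real class.
class FakeString {
 public:
  explicit FakeString(std::string contents) : contents_(std::move(contents)) {}

  // Debug-style short print: type tag, length, then the quoted contents.
  void ShortPrint(std::ostream& os) const {
    os << "<String[" << contents_.size() << "]: \"" << contents_ << "\">";
  }

  // Returns only the characters, without any debug decoration.
  const std::string& Contents() const { return contents_; }

 private:
  std::string contents_;
};

// Streaming the object is routed to the debug printer, which mirrors what
// happens with `std::cout << source_url` in the question.
std::ostream& operator<<(std::ostream& os, const FakeString& s) {
  s.ShortPrint(os);
  return os;
}

// Helper that captures the debug form in a std::string.
std::string BriefToString(const FakeString& s) {
  std::ostringstream os;
  os << s;
  return os.str();
}
```

    With a FakeString named name, std::cout << name prints the decorated debug form, while std::cout << name.Contents() prints the URL text alone. The second form is what the question is after.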