A long-standing problem in Perl is how to identify a location with finer granularity than a line number. (Follow the link for more information.) This question is about how to get that.
The most promising way to do this is to use the Perl opcode address that is under consideration and deparse the statements around that. And at the level of the a subroutine, B::Deparse will recreate Perl given a code reference. So ideal would be to modify B::Deparse to allow you to give a supplied op to start deparsing. Failing that, it could instead deparse the enclosing subroutine, displaying op-code addresses for each statement encountered. See the code below for an example of this.
B::Concise can show a op-code disassembly for a subroutine. In its disassembly output, it gives addresses, and those addresses it gives match those returned, say, by Devel::Callsite.
The problem is that after instrumenting B::Deparse as done below, the OP addresses it gives does not match those given by B::Concise or Devel::Callsite. Output given below shows this.
I can normalize addresses so that they refer to relative offsets rather than absolute addresses. However this is a lot of work, is gross, and I'm not even totally sure this will work, since Deparse may change code by "pessimizing" or, I guess, undoing optimization.
For concreteness, below is some code that shows the mismatch. Note that none of the addresses given by deparse is shown in the disassembly.
use B::Deparse;
use B::Concise qw(set_style);
sub foo() {
my $x=1; $x+=1;
}
my $deparse = B::Deparse->new("-p", "-l", "-sC");
$body = $deparse->coderef2text(\&foo);
print($body, "\n");
my $walker = B::Concise::compile('-basic', 'foo', \&foo);
B::Concise::set_style_standard('debug');
B::Concise::walk_output(\my $buf);
$walker->(); # walks and renders into $buf;
print($buf);
package B::Deparse;
# Modified to show OP addresses
sub lineseq {
my($self, $root, $cx, @ops) = @_;
my($expr, @exprs);
my $out_cop = $self->{'curcop'};
my $out_seq = defined($out_cop) ? $out_cop->cop_seq : undef;
my $limit_seq;
if (defined $root) {
$limit_seq = $out_seq;
my $nseq;
$nseq = $self->find_scope_st($root->sibling) if ${$root->sibling};
$limit_seq = $nseq if !defined($limit_seq)
or defined($nseq) && $nseq < $limit_seq;
}
$limit_seq = $self->{'limit_seq'}
if defined($self->{'limit_seq'})
&& (!defined($limit_seq) || $self->{'limit_seq'} < $limit_seq);
local $self->{'limit_seq'} = $limit_seq;
my $fn = sub {
my ($text, $i) = @_;
my $op = $ops[$i];
push @exprs, sprintf("# op: 0x%x\n%s ", $op, $text);
};
$self->walk_lineseq($root, \@ops, $fn);
# $self->walk_lineseq($root, \@ops,
# sub { push @exprs, $_[0]} );
my $sep = $cx ? '; ' : ";\n";
my $body = join($sep, grep {length} @exprs);
my $subs = "";
if (defined $root && defined $limit_seq && !$self->{'in_format'}) {
$subs = join "\n", $self->seq_subs($limit_seq);
}
return join($sep, grep {length} $body, $subs);
}
The output I get from running this is:
() {
# op: 0x14a4b30
#line 4 "deparse-so.pl"
(my $x = 1) ;
# op: 0x14a4aa0
#line 4 "deparse-so.pl"
($x += 1) ;
}
main::foo:
UNOP (0xeb9978)
op_next 0
op_sibling 0
op_ppaddr PL_ppaddr[OP_LEAVESUB]
op_type 175
op_flags 4
op_private 65
op_first 0xeab7a0
LISTOP (0xeab7a0)
op_next 0xeb9978
op_sibling 0
op_ppaddr PL_ppaddr[OP_LINESEQ]
op_type 181
op_flags 12
op_private 0
op_first 0xeab7e8
op_last 0xeb9a20
COP (0xeab7e8)
op_next 0xeab890
op_sibling 0xeab848
op_ppaddr PL_ppaddr[OP_NEXTSTATE]
op_type 182
op_flags 1
op_private 0 0
BINOP (0xeab848)
op_next 0xeb99c0
op_sibling 0xeb99c0
op_ppaddr PL_ppaddr[OP_SASSIGN]
op_type 37
op_flags 69
op_private 2
op_first 0xeab890
op_last 0xeab8d0
SVOP (0xeab890)
op_next 0xeab8d0
op_sibling 0xeab8d0
op_ppaddr PL_ppaddr[OP_CONST]
op_type 5
op_flags 2
op_private 0
op_sv 0xea3c40
OP (0xeab8d0)
op_next 0xeab848
op_sibling 0
op_ppaddr PL_ppaddr[OP_PADSV]
op_type 9
op_flags 178
op_private 128
COP (0xeb99c0)
op_next 0xeab768
op_sibling 0xeb9a20
op_ppaddr PL_ppaddr[OP_NEXTSTATE]
op_type 182
op_flags 1
op_private 0 0
BINOP (0xeb9a20)
op_next 0xeb9978
op_sibling 0
op_ppaddr PL_ppaddr[OP_ADD]
op_type 63
op_flags 70
op_private 2
op_first 0xeab768
op_last 0xeb9a68
OP (0xeab768)
op_next 0xeb9a68
op_sibling 0xeb9a68
op_ppaddr PL_ppaddr[OP_PADSV]
op_type 9
op_flags 50
op_private 0
SVOP (0xeb9a68)
op_next 0xeb9a20
op_sibling 0
op_ppaddr PL_ppaddr[OP_CONST]
op_type 5
op_flags 2
op_private 0
op_sv 0xea3c10
B::Concise::compile(CODE(0xea3c70))
UNOP (0xeb9978)
op_next 0
op_sibling 0
op_ppaddr PL_ppaddr[OP_LEAVESUB]
op_type 175
op_flags 4
op_private 65
op_first 0xeab7a0
LISTOP (0xeab7a0)
op_next 0xeb9978
op_sibling 0
op_ppaddr PL_ppaddr[OP_LINESEQ]
op_type 181
op_flags 12
op_private 0
op_first 0xeab7e8
op_last 0xeb9a20
COP (0xeab7e8)
op_next 0xeab890
op_sibling 0xeab848
op_ppaddr PL_ppaddr[OP_NEXTSTATE]
op_type 182
op_flags 1
op_private 0 0
BINOP (0xeab848)
op_next 0xeb99c0
op_sibling 0xeb99c0
op_ppaddr PL_ppaddr[OP_SASSIGN]
op_type 37
op_flags 69
op_private 2
op_first 0xeab890
op_last 0xeab8d0
SVOP (0xeab890)
op_next 0xeab8d0
op_sibling 0xeab8d0
op_ppaddr PL_ppaddr[OP_CONST]
op_type 5
op_flags 2
op_private 0
op_sv 0xea3c40
OP (0xeab8d0)
op_next 0xeab848
op_sibling 0
op_ppaddr PL_ppaddr[OP_PADSV]
op_type 9
op_flags 178
op_private 128
COP (0xeb99c0)
op_next 0xeab768
op_sibling 0xeb9a20
op_ppaddr PL_ppaddr[OP_NEXTSTATE]
op_type 182
op_flags 1
op_private 0 0
BINOP (0xeb9a20)
op_next 0xeb9978
op_sibling 0
op_ppaddr PL_ppaddr[OP_ADD]
op_type 63
op_flags 70
op_private 2
op_first 0xeab768
op_last 0xeb9a68
OP (0xeab768)
op_next 0xeb9a68
op_sibling 0xeb9a68
op_ppaddr PL_ppaddr[OP_PADSV]
op_type 9
op_flags 50
op_private 0
SVOP (0xeb9a68)
op_next 0xeb9a20
op_sibling 0
op_ppaddr PL_ppaddr[OP_CONST]
op_type 5
op_flags 2
op_private 0
op_sv 0xea3c10
Finally, as way of encouraging people to help here, if this is solved, the solution will probably appear in by Perl debugger Devel::Trepan and allow you to reliably know exactly where you are when stopped inside the debugger.
Note: edited to make the question clearer.
ikegami's answer suggestion buried in the comments lead me to find the conceptual flaw I made in my first-proposed solution: inside B::Deparse a lexical array variable stores OPs and those are implicit pointers to the actual code OP structures. Using the undocumented $$
to get the underlying address that the scalar implicitly points to gives the correct address. So in my monkey-patched code of B::Deparse::lineseq, changing:
push @exprs, sprintf("# op: 0x%x\n%s ", $op, $text);
to:
push @exprs, sprintf("# op: 0x%x\n%s ", $$op, $text);
^^
gives me an address that I can use to match up results.
Still, there's a bit of work still to get this usable, so if there are any other ways or suggestions, I'd love to hear them.
Devel::Trepan release 0.70 now makes use in its deparse
command of the above code suitably modified to be able to show which of multiple statements is about to be run.