I'm trying to measure the time it takes for the ATmega2560 to perform a matrix multiplication.
For that, I'm using Timer0 in Normal mode and counting the number of overflow interrupts.
I've set up two possible configurations: one to get an interrupt every 1 ms, and another one to get them every 1 µs. The problem is: when timing in milliseconds I get 44 ms, but when timing in microseconds I get 366 µs (I think I should be getting around 44,000 µs at least, or vice versa).
This is the code for the Timer0 configuration. To set up the TCNT0 register I followed the formula given here, which I also checked against the datasheet: https://arcmicrocontrollers.files.wordpress.com/2022/07/imagen-22.png
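For reference, the preload values come from that formula, reload = 256 − F_CPU × t / prescaler: for 1 ms with prescaler 64 that is 256 − 16 000 000 × 0.001 / 64 = 256 − 250 = 6, and for 1 µs with prescaler 8 it is 256 − 16 000 000 × 0.000001 / 8 = 256 − 2 = 254.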
/*
 *
 * Counter with TIMER0
 * Normal mode
 * Interrupts. Datasheet chapter 16.
 *
 */
#include "timer.h"

volatile uint16_t timer0_overflow_count = 0;

void timer0_init(uint8_t resolution) {
    uint8_t prescaler = 0;
    if (resolution == RESOLUTION_MS) {
        prescaler = (1 << CS01) | (1 << CS00); // Prescaler 64
        TCNT0 = 6;
    }
    else if (resolution == RESOLUTION_US) {
        prescaler = (1 << CS01); // Prescaler 8
        TCNT0 = 254;
    }
    // Enable the overflow interrupt
    TIMSK0 |= (1 << TOIE0);
    // Start the timer
    TCCR0B = prescaler;
    // Enable global interrupts
    sei();
}

// Function to get the elapsed time
uint16_t timer0_getCount() {
    return timer0_overflow_count;
}

// Interrupt service routine for the TIMER0 overflow
ISR(TIMER0_OVF_vect) {
    timer0_overflow_count++;
}
And this is my main program
/*
 *
 * The maximum amount of RAM available for the matrices is
 * 8 KB
 *
 */
#define F_CPU 16000000UL   // defined before util/delay.h so the delays are correct
#include <avr/io.h>
#include <stdlib.h>
#include <util/delay.h>
#include <uart.h>
#include <timer.h>

uint8_t const n = 32;
uint8_t *A, *B, *C;

// Blink the built-in LED
void error() {
    while (1) {
        PORTB |= (1 << 7);
        _delay_ms(20);
        PORTB &= ~(1 << 7);
        _delay_ms(200);
    }
}

// Perform the matrix product
void multiplicar() {
    for (uint8_t i = 0; i < n; i++) {
        for (uint8_t j = 0; j < n; j++) {
            for (uint8_t k = 0; k < n; k++) {
                C[i * n + j] += A[i * n + k] * B[j * n + k];
            }
        }
    }
}

// Validate the result of the operation
void validar() {
    for (uint16_t i = 0; i < n * n; i++) {
        if (C[i] != n) {
            error(); // blocking
        }
    }
}

int main() {
    DDRB |= (1 << 7); // LED
    UART_Init();
    A = (uint8_t*)malloc(n * n * sizeof(uint8_t));
    B = (uint8_t*)malloc(n * n * sizeof(uint8_t));
    C = (uint8_t*)malloc(n * n * sizeof(uint8_t));
    // Check if there was enough memory
    if (A == NULL || B == NULL || C == NULL) {
        error();
    }
    // Initialize the matrices
    for (int i = 0; i < n * n; i++) {
        A[i] = 1;
        B[i] = 1;
        C[i] = 0;
    }
    uint8_t res = RESOLUTION_US;
    timer0_init(res);
    multiplicar();
    uint16_t time = timer0_getCount();
    validar();
    // Print the elapsed time
    UART_PrintStr("Elapsed time: ");
    UART_PrintNumber(time);
    (res == RESOLUTION_MS) ? UART_PrintStr(" ms\n") : UART_PrintStr(" us\n");
    while (1) {
    }
    return 0;
}
It's going to be more reliable and accurate if you measure the run time of the algorithm by just driving a pin high when the function starts, driving it low when the function ends, and looking at the signal on an oscilloscope.
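For example, something along these lines, spliced into the existing main() and reusing PB7 (the built-in LED pin the question already configures as an output) as the timing pin; any free output pin works the same way:

// Sketch of the pin-toggle measurement
DDRB |= (1 << 7);   // PB7 as output (already done for the LED)
PORTB |= (1 << 7);  // drive the pin high right before the work starts
multiplicar();      // the code being timed
PORTB &= ~(1 << 7); // drive it low as soon as the work finishes
// The width of the high pulse on the oscilloscope is the run time.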
If you really want to use a timer on the AVR to measure the run time, you can do it, but it would be better to avoid using interrupts or to minimize the number of interrupts you have, since every interrupt will take up some CPU cycles, affecting the measurement you are trying to make. So what you would want to do is reset the timer's counter, set the prescaler to a pretty high value to minimize the overhead, run your algorithm, then at the end you would record both the number of overflow interrupts that occurred and the current count of the timer. It's a little bit tricky to avoid bugs due to overflow, but you can use both of those together to compute the run time.
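A rough sketch of that idea, reusing the overflow counter and ISR from your timer.c (assuming the counter is accessible from main, e.g. via an extern declaration or a small reset helper), with the prescaler bumped to 1024 so there is only one overflow every 16.384 ms at 16 MHz:

// Rough sketch: free-running Timer0, prescaler 1024, ISR only counts wrap-arounds
TCCR0A = 0;                          // Normal mode
TCNT0 = 0;                           // reset the counter
timer0_overflow_count = 0;
TIMSK0 |= (1 << TOIE0);              // enable the overflow interrupt
sei();
TCCR0B = (1 << CS02) | (1 << CS00);  // start with prescaler 1024

multiplicar();

TCCR0B = 0;                          // stop the timer before reading it
uint8_t remainder = TCNT0;
uint16_t overflows = timer0_getCount();
// One tick = 1024 / 16 MHz = 64 us; total ticks = overflows * 256 + remainder.
// (As noted above, if an overflow happens right at the end and its interrupt has
// not run yet, this can be off by one overflow period.)
uint32_t ticks = (uint32_t)overflows * 256UL + remainder;
uint32_t elapsed_us = ticks * 64UL;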
Attempting to run an interrupt once per microsecond on an 8-bit AVR, which is probably running at 16 MHz or less, is not a good idea. Remember that a 16 MHz AVR can only execute at most 16 instructions in one microsecond, and your simple ISR alone takes at least 8 instructions once you count saving and restoring registers and the reti.
The main problem I see with your code is that the only relevant thing you do when you change from milliseconds to microseconds is that you are changing the prescaler by a factor of 8. (Note that your result only changed by a factor of about 8, not a factor of 1000 like you wanted.) Your writes to TCNT0 are irrelevant because once the timer reaches its TOP value and overflows, the count in TCNT0 should go back to 0. You need to consult the ATmega2560 datasheet and see what register Timer 0's TOP value is stored in, and write appropriate values to that to set the interrupt rate.
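For what it's worth, in Normal mode Timer0's TOP is fixed at 0xFF; the usual way to get a configurable period on the ATmega2560 is CTC mode, where OCR0A acts as TOP and you use the compare-match interrupt instead of the overflow one. A sketch of a 1 ms tick set up that way (16 MHz clock and prescaler 64 assumed):

// Sketch: CTC mode, OCR0A as TOP, compare-match interrupt every 1 ms at 16 MHz
TCCR0A = (1 << WGM01);               // CTC mode (TOP = OCR0A)
OCR0A = 249;                         // 16 MHz / 64 = 250 kHz -> 250 ticks = 1 ms
TIMSK0 |= (1 << OCIE0A);             // enable the compare-match A interrupt
TCCR0B = (1 << CS01) | (1 << CS00);  // start with prescaler 64
sei();

ISR(TIMER0_COMPA_vect) {
    timer0_overflow_count++;         // now each increment really is 1 ms
}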
There might be other things you need to fix too. I have not looked thoroughly at the datasheet.