How to Find the Line of Best Fit using the Least Square Method

The Line of Best Fit helps us find the best fitting line for a set of points(x,y). In experiments, where x and y have a linear relationship, this can help us draw more accurate results. Let us how to create a line of best fit from the given data.


  •  As usual, considering that the number of points may be large, we use an input file to save the data and read it from. It reads the value of x and y in two columns.
  • We also set a counter for counting the number of sets (x,y).
  • We allocate yfit(n), the total number of points.
  • We calculate all variables required in the formulas for a and b, where each yfit(i) is a+b*x(i), where a is the y-intercept and b is the slope of the line. We then calculate each yfit in a do loop.
  • We write results to file. Additionally, we may plot the results x,y and x,yfit on Gnuplot for comparison.

Now for the Fortran program,

!To find points on a straight line fitting the data

program sqfit
    implicit none
    real sumx,sumy,sumx2,sumxy,xbar,ybar,a,b
    integer n,i
    real x(1000),y(1000)!To read data
    real, allocatable :: yfit(:)
   
   
     n=0
    write (*,15) "x","y"
    15 format (10x,a1,15x,a1)
    open (5,file='file1.dat')
    do i=1,1000
          read (5,*,end=10) x(i),y(i)
          write (*,*) x(i),y(i) !This write statement is in the output anyway
          n=n+1
    end do
   
    10 write(*,*)
      write (*,*) "For the given ",n," sets of points (x,y)"
     
      sumx=sum(x) !To calculate the sums and averages for the main formulas
      sumy=sum(y)
      sumx2=sum(x*x)
      sumxy=sum(x*y)
      xbar=sumx/n
      ybar=sumy/n

      a=((ybar*sumx2)-(xbar*sumxy))/(sumx2-(real(n)*(xbar**2)))!Formula for the y-intercept of the best linear curve
      b=(sumxy-(real(n)*xbar*ybar))/(sumx2-(real(n)*(xbar**2)))!Formula for the slope of the best linear curve
      allocate (yfit(n))
      do i=1,n
      yfit(i)=a+b*x(i)
      end do
     
      write (*,*) "y-intercept, a= ",a !To display the output
      write (*,*)"slope,b= ",b
      write (*,20)"x","y","yfit"
      20 format (10x,a1,15x,a1,13x,a4)
      do i=1,n
              write (*,*)x(i),y(i),yfit(i)
      end do
     
      open (unit=7, file='file2.dat') !To write the output to a file
      do i=1,n
              write (7,*) x(i),yfit(i)
      end do
     
end program sqfit

Sample Output for Least Square Fitting
Gnuplot for Points x,y and Line x,yfit
How to Plot the Two Sets of Data on Gnuplot
To plot the data, change the directory to where you have saved the .dat files. Then, type the following command:
plot 'file1.dat' with points, 'file2.dat' with linespoints
That should do.

No comments:

Post a Comment